
THE  MEASUREMENT  OF  MUSICAL 
DEVELOPMENT  II 


by 
Melvin  S.  Hattwick,  Ph.D. 

and 
Harold  M.  Williams,  Ph.D. 


George  D.  Stoddard,  Ph.D.,  Editor 

University  of  Iowa  Studies 

Studies  in  Child  Welfare 


VOLUME  XI,  NO.  2 


PUBLISHED  BY  THE  UNIVERSITY 

IOWA  CITY,  IOWA 

1935 


FOREWORD 

Dr. Hattwick's methods of measurement appear reliable for
either experimental or practical use by the first year of school. His
findings present further evidence for the early appearance and
slow growth of pitch discrimination ability in children. But in spite
of increased true singing ability, children become noticeably
diffident or resistant by the fourth grade level.

This work is in sequence with Professor Williams' previous Station
study entitled The Measurement of Musical Development.

George  D.  Stoddard 

Office of the Director

Iowa Child Welfare Research Station

University  of  Iowa 

January  9,  1935 


CONTENTS 

Chapter                                                          Page

PART ONE. A GENETIC STUDY OF DIFFERENTIAL PITCH SENSITIVITY
Melvin S. Hattwick

I. General Orientation and Scope of the Investigation ..... 9
    General Orientation ..... 9
    Scope of the Study ..... 9
II. Experiments in Methodology: The Singing Technique ..... 14
    Historical Summary ..... 14
    Problems in the Evaluation of the Interval Singing Technique ..... 15
    The Musical Laboratory ..... 15
    Interval Singing Experiments ..... 15
    Summary of Interval Singing as a Technique for Measuring
      Pitch Discrimination ..... 21
III. Experiments in Methodology: The Verbal Concept Technique ..... 23
    Subjects and Method ..... 23
    Group Testing ..... 24
    Individual Tests ..... 38
IV. Group and Individual Test Results ..... 46
    Choice of Intervals ..... 46
    Number of Trials ..... 46
    Experimental Evaluation of Test ..... 49
    Objective Criteria of Test Comprehension ..... 49
    Results ..... 53
    Value of the Present Group Test for Children in Fifth Grade
      and Under ..... 58
    Tentative Norms ..... 59
    Group Discrimination Limens ..... 60
    Summary ..... 61
V. Genetic Growth of Differential Pitch Sensitivity ..... 63
    Normative Summary of Differential Reactions to Small Pitch
      Differences From Four to Ten Years of Age ..... 63
    Implications of Results ..... 65
References ..... 67

PART TWO. MANUAL OF INSTRUCTIONS AND INTERPRETATIONS FOR A
PITCH DISCRIMINATION TEST FOR YOUNG CHILDREN
Melvin S. Hattwick

Materials Needed ..... 71
Instructions ..... 71
Computation and Interpretation of Results ..... 72
Individual Tests of Pitch Discrimination for Young Children ..... 73
Disadvantages of the Pitch Test ..... 74

PART THREE. A NOTE REGARDING THE PSYCHOPHYSICAL ANALYSIS OF
PITCH DISCRIMINATION IN YOUNG CHILDREN
Harold M. Williams and Melvin S. Hattwick

Determination of Group Thresholds ..... 77
Fit of Group Data and Normal Probability Integral ..... 79
Distribution of Individual Thresholds ..... 79
References ..... 84

PART FOUR. IMMEDIATE AND DELAYED MEMORY OF PRESCHOOL CHILDREN
FOR PITCH IN TONAL SEQUENCES
Harold M. Williams

Experimental Conditions ..... 87
Method of Scoring ..... 88
Learning Curves ..... 89
Correlations Between Immediate and Delayed Recall ..... 92
Conclusions ..... 93
References ..... 94

APPENDIX 

Exaggeration of Intervals ..... 97
Procedure and Instructions for the Second Interval Singing Test ..... 97
Typical Procedure and Instructions used in Group Testing ..... 98
A Recorded Test of Pitch Discrimination for Young Children ..... 99


PART  ONE 

A  GENETIC  STUDY  OF  DIFFERENTIAL 
PITCH  SENSITIVITY  * 

BY 

Melvin  S.  Hattwick 


* This study was directed by Dr. Harold M. Williams.


CHAPTER  I 

GENERAL  ORIENTATION  AND  SCOPE  OF  THE 

INVESTIGATION 

GENERAL  ORIENTATION 

The  purpose  of  this  investigation  was  to  study  certain  aspects  of 
pitch  sensitivity  in  young  children,  particularly  the  problem  of 
differential  responses  to  small  intervals. 

At its lowest level pitch sensitivity has been shown to be expressed
by changes in a generalized type of body activity (20); at its highest
level it involves an integration of sensory with cognitive processes,
measurable by means of verbal responses, as in pitch discrimination.
Between these two extremes development is taking place in the sensory
and cognitive processes, with evidence supporting the view that the
former is completed earlier than the latter. In a study of differential
pitch sensitivity one must consider (1) the possibility of finding a
useful method of measurement which does not necessitate the use of
complex conceptual devices, and (2) the possibility of revising present
methods using the verbal concept so that they may be simpler and easier
for children to understand. The emphasis in the present study is on the
latter.

SCOPE  OF  THE  STUDY 

Possible  Methods  of  Investigation 

Of  the  possible  methods  for  measuring  differential  reactions  to 
pitch,  there  are  four  which  seem  to  be  most  promising.  These  might 
be  termed  the  interval  matching  method,  the  conditioned  response 
method,  the  singing  method,  and  the  verbal  concept  method. 

Interval Matching Method. The interval matching method, suggested
by Williams (23) on the basis of a teaching device of Montessori (14),
consists in providing the child with a set of bells similar to those
described later in this study (p. 13). The child is instructed to
strike these in the same order the experimenter used, on the analogy
of color matching. The natural interest of such a method recommends
it for children, but certain important objections are met in the
mechanics of the test. For example, unless the child strikes the
first of the two tones correctly each time, he may become confused by
hearing a tone he did not expect. The time required for the child to
respond by striking the bells after the test presentation is another
disadvantage. Still another is the inability of the child to strike
the bells at the approximate intensities of the test interval, a fact
which would undoubtedly tend to confuse him as the intervals became
smaller. This method was rejected, therefore, for the present study.

Conditioned Response Method. The conditioned response method has one
distinct advantage as a measure of pitch discrimination in that it
might permit one to establish a threshold accurately for very young
children or even infants. Probably the greatest disadvantage lies in
the prohibitive amount of time necessary. Furthermore, it would not
yield a limen comparable with procedures already used with adults.
Finally, it might lead to nervous indications when very small
differences are used, such as Pavlov (15) found in working with dogs.
This method was also rejected for use in the present study.

Singing Method. In the singing method the child listens to simple
intervals under the instruction to sing what he hears. The essential
requirements of the method are that the child understand instructions
and have a certain degree of voco-motor control. Two criteria are
possible: singing the intervals accurately and singing them
directionally (that is, following the pitch change in direction if
not in absolute amount). Since the instructions are simple, no
serious difficulty may be expected from this source. The second
requirement, however, presents a more serious problem since a lack of
voco-motor control necessarily results in failure of the method. To
the extent that children tend to imitate exactly the pitch of the
tones a demand is made on the discriminative ability of the
experimenter. The advantages of the method may be listed as (1)
simplicity of the instructions, (2) rapidity of the test, and (3) the
apparent early spontaneous appearance of the ability to imitate
pitch.

Verbal Concept Method. In this method two tones are presented to the
subject, who is required to respond verbally with some sign such as
"The last is higher." This method imposes certain requirements on the
subject, the most important of which is that he understand the
concept employed. The "same-different" terminology used by Gilbert
(5) and Meissner (12) in experiments with children must be considered
unsatisfactory because of the possibility of judging the two tones on
the basis of intensity¹ rather than pitch. The terms "higher-lower"
of the Seashore test, if properly comprehended, are clear. A point
which may be raised against this terminology, however, is the
possibility of confusion as to which of the two tones is to be judged
in terms of the other. In addition, one must make certain that the
terms "high" and "low" have been properly associated in the pitch
continuum.

Williams (23) has suggested a terminology of "going upstairs, going
downstairs" which, after preliminary investigation by the writer, has
been shortened to "going up, going down." From the standpoint of
logic and simplicity in comprehension this terminology has certain
advantages for use with children. The phrases "going up, going down"
are simple enough, at least in the visual field, to be understood by
practically every normal child from five years on.² The word "going"
suggests a directional idea each time and helps the child realize
that he should judge the last tone in relation to the first without
the necessity of repeating this distinction.

On a logical basis, the greatest difficulties of the concept method
have to do with the comprehension of terms. These difficulties are
counterbalanced to a large extent by such advantages as (1) quickness
of test procedure; (2) the possibility of establishing a clear
criterion for judging; (3) the fact that this method, if perfected,
gives the commonest conception of a limen in this field; and (4) the
general musical validity of such a technique.

Summary. From the survey of possible methods, those of singing and
verbal concept seem to be the least objectionable and to show the
most possibilities. These two, therefore, have been chosen
particularly for study. The experimental data to be presented in this
research may be listed as follows:

With regard to the singing procedure

1. Degree of resistance to singing at various age levels under
   different conditions of motivation

2. Percentage of children at various age levels able to reproduce
   intervals directionally

3. Effect of practice on the responses

4. Comparison of results from large and small intervals

1 It is practically impossible to insure equal intensity for every tone of every pair,
although practice may minimize the difficulty. In the Seashore record the intensities were
intentionally varied in a random way.

2  This  statement  will  be  shown  to  be  true  under  the  section   on   page  39  in  Chapter  IV. 



With regard to the concept procedure

1. Effect of factors such as concept terminology, pace of test, type
   of recording on group and individual test scores

2. Effect of practice on the responses

3. Percentage of children testable at various age levels by group and
   individual methods

In addition, a standard procedure for group testing has been
evaluated and tentative norms derived. Finally, a descriptive
normative summary is given based on all the results of experiments to
date.

Sources  of  Data 

The children serving as subjects in this study were enrolled in
the following schools: (1) the four- and five-year-old groups of the
preschool laboratories at the Iowa Child Welfare Research Station
at Iowa City, Iowa; (2) the first and second grades of the University
Elementary School at Iowa City; (3) the second to fifth grades
inclusive of St. Patrick's Parochial School at Iowa City; and (4)
the first to fifth grades inclusive from twenty-six elementary
schools in the Des Moines Public School System at Des Moines,
Iowa.

The  total  number  of  children  on  whom  experimental  data  are 
presented  in  the  following  sections  is  3,902.  The  age  range  of  these 
children  was  from  three  years,  six  months  to  twelve  years,  four 
months. 

Apparatus 

One of the major difficulties in testing the pitch discrimination of
children lies in the fact that a simple, reliable, and interesting
(from the child's standpoint) means of producing tones is difficult
to find. Up to the time of this investigation the sound sources for
pitch tests have been tuning forks, variable pitch pipes, and
Victrola records such as the test of pitch discrimination in the
Seashore series (recorded by the Columbia phonograph company).
McGinnis (11) has reported unfavorably on the use of the Seashore
records, except with modifications, for testing pitch discrimination
of children. Seashore (18), Allen (1), and Halverson (reported by
Town (21)) have used the tuning fork technique with children, the
first two investigators having had the children report by concept and
the last by singing. All three attribute their difficulties more to
the method of testing than to any inadequacies of sound production.
Gilbert (5) and Meissner (12) used a variable pitch pipe as a source
of sound. In a recent study Hissem (8) reports having used
scientifically tuned chimes or gongs in testing children's ability to
discriminate simple tones or rhythms.

The sound-producing apparatus for the present investigation
consisted of a vibrating bar hung over a resonator, which was
tuned to the bar. By striking the bar with a small rubber hammer
a pure tone is produced. The bars were made by the Deagan company
and are similar to those found on dinner gongs. They were tuned to
the appropriate pitches³ by the investigator for the testing series.

The resonators were made from ordinary one and one-half inch
brass tubing. One end was sealed by a brass cap and a three-fourths
inch hole was cut from the side of the tube halfway from the ends.
The other end was closed by means of a cork with a smooth inner
surface of cardboard which would reflect well. By means of the
adjustable cork the length of each resonator could be adjusted so
that each would "speak" at a maximum when the bar was struck.
The general construction was identical with the usual dinner gong.
The bells could be easily handled by the experimenter at the same
time that responses from the subject were being recorded. Extensive
practice in striking and damping the tones reduced to a minimum the
possibilities of any large differences in the length or intensity of
the sounds.


3 The beat method was used in tuning the bars. Reliability between three observers in
counting beats varied from .961 ± .002 to .923 ± .006 for various rates of beating tones.
Oscillographic records were taken of the tone produced when the 442 d. v. bar was struck
medium loud (as used in testing) and very loud. Harmonic analysis (Henrici analyzer)
showed that in the medium loud tone 98.6 per cent of the total energy was contained in
the fundamental component, while in the very loud tone 98.4 per cent of the total energy
was found in the fundamental. The remaining energy, 1.4 per cent and 1.6 per cent
respectively, was distributed over the remaining nine partials in very small proportions.
(Each wave was analyzed through the first ten partials.)
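
The beat method named in this footnote rests on the fact that two tones sounded together beat at a rate equal to the difference of their frequencies. A minimal Python sketch follows, assuming frequencies expressed in d. v. (complete cycles) per second. The function name and the 450 d. v. bar are hypothetical illustrations, not values taken from the study.

```python
def beat_rate(f1_dv: float, f2_dv: float) -> float:
    """Return beats per second for two simultaneous tones.

    Frequencies are given in double vibrations (d. v.) per second,
    that is, complete cycles per second.
    """
    return abs(f1_dv - f2_dv)


# Hypothetical example: the 442 d. v. bar named in the footnote sounded
# with a bar intended to lie 8 d. v. higher. Counting eight beats per
# second would indicate that the second bar is tuned as intended.
reference_bar = 442.0
second_bar = 450.0  # assumed value, for illustration only
print(beat_rate(reference_bar, second_bar))  # 8.0
```

The smaller the intended interval, the slower the beats, which is presumably why agreement between observers is reported separately for various rates of beating.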


CHAPTER  II 

EXPERIMENTS  IN  METHODOLOGY:  THE 
SINGING  TECHNIQUE 

HISTORICAL  SUMMARY 

Seashore (19) has suggested that pitch differences are exaggerated
when reproduced in singing. Miles (13), and later Williams (22),
confirmed this statement with respect to adults. Halverson
(reported by Town (21)), however, was the first and, until the
present investigation, the only one to utilize the suggestion in a
test with children.

Halverson asked fifty-three children, ages five to seven inclusive,
to sing the sounds they heard (tuning forks placed before resonators
being used as sound sources). "The frequently occurring
exaggerations of interval by the children, if in the right direction,
were taken to mean that the children really appreciated the
difference in pitch. The least difference in pitch that the child
distinguished with certainty was recorded as the score." ((21), p. 18)
He concluded that "the variations in the findings are so great that
the group norm is useless. The scores vary from 2 to 17 double
vibration differences, with 13 total failures." ((21), pp. 18-19)

The  number  of  failures  (25  per  cent)  and  the  variability  in  the 
scores,  however,  may  have  been  due  to  a  variety  of  factors.  These 
might  have  included  failure  to  exaggerate  the  differences  enough 
(which  in  very  small  interval  differences  could  easily  be  confused 
with  failure),  inattention,  disinterest,  or  fatigue,  since  it  cannot 
be  determined  from  the  published  report  how  many  trials  were 
given  each  child. 
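
The 25 per cent figure follows from the counts quoted above. A short Python check, using only the 53 children and 13 total failures reported for Halverson's test (the variable names are illustrative):

```python
children_tested = 53  # Halverson's subjects, ages five to seven
total_failures = 13   # "13 total failures" in the quoted report

failure_rate = 100 * total_failures / children_tested
print(f"{failure_rate:.1f} per cent")  # 24.5 per cent, cited as 25
```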

In general, errors due to subjective estimation of responses when
using the singing technique almost always yield results that are
unfair to the child. Nevertheless, since the instructions in the
method are easily understood at young age levels and since responses
are apparently readily given in most cases, the technique can hardly
be dismissed on the adverse but inconclusive evidence presented by
Halverson.



PROBLEMS  IN  THE  EVALUATION  OF  THE  INTERVAL 

SINGING  TECHNIQUE 

It has seemed worth while to investigate the singing method more
thoroughly. The following questions have accordingly been given
experimental consideration in this section: (1) To what extent is
there emotional resistance to singing? (2) What per cent of those
responding satisfactorily can reproduce the intervals in the correct
direction? (3) Is exaggeration of intervals, if found, great enough
to permit reliable subjective estimates of direction as such
intervals grow smaller? (4) What are the effects of minimum practice
on success?

Since the singing technique would seem to have its greatest
usefulness at the lower age levels, the presentation of data in this
section follows the order of increasing age.

THE  MUSICAL  LABORATORY 

To provide adequate motivation in this experiment, the investigator
arranged a musical laboratory for younger children in which he had a
number of musical instruments including a xylophone, a number of
organ pipes both wood and metal, Chinese gongs, whistles, bells, and
so forth.

The child was brought into this laboratory and allowed to play
with the musical materials while the experimenter made suggestions
and answered questions. At an appropriate time the suggestion was
made that both the experimenter and the subject play a game with the
bells which were sitting on a table. After a preliminary introduction
to the test in this manner, the child was taken away with the promise
of another game on the following day. The performance of the children
on these preliminary trials was never included in the experimental
data.

INTERVAL  SINGING  EXPERIMENTS 

Each child was told to sit quietly, shut his eyes, and listen to the
bells and as soon as the bells stopped to sing "ding-dong" just like
the bells. The experimenter sang several trials for demonstration
and then tested the child.⁴

Eighteen four-year-old and twenty-four five-year-old children
(ages four years, two months to five years, eleven months) from
the preschools of the Iowa Child Welfare Research Station served
as subjects in this experiment.

4 Recording was done by the experimenter on special blanks in all individual testing
throughout the study. After some practice he became quite proficient in ringing the bells
with the right hand and recording with the left.

First Main Interval Singing Test

In this test twenty-five trials were given. The intervals used (26,
14, and 8 d. v.⁵) were chosen merely for preliminary orientation
and it was not expected that they would really approach threshold
value. The direction of the intervals on each trial (whether they
should be "going up" or "going down") was determined originally
by random choice and placed on a key.

The results of this test are found in the following tabulation:

                                            Age, Years
                                       4              5            Total
Children                          Number  Per    Number  Per    Number  Per
                                          Cent           Cent           Cent
Tested                              16             24             40
Responded for twenty-five trials    12     75      17     71      29     73
Singing direction correctly in
  90 per cent of trials              3     25      12     70      15     52
Refusing to sing but accepting
  a verbal method of response        3     19       5     21       8     20
Refusing to sing or respond
  verbally                           1      6       2      8       3      8

These results are very similar to the results found in a shorter
test in which but a single large interval was used. The proportion
of success in directional singing was small. It seems fairly
conclusive, therefore, that voco-motor control presents the
fundamental difficulty of this technique at those ages.
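
The key and the 90 per cent criterion used in this test can be sketched in Python. The sketch assumes that interval size as well as direction was drawn at random for each trial (the text states this only for direction) and that a child passes when at least 90 per cent of the sung directions match the key. The function names are hypothetical.

```python
import random

INTERVALS_DV = (26, 14, 8)  # interval sizes named in the text, in d. v.
N_TRIALS = 25               # trials given in the first main test


def make_key(n_trials=N_TRIALS, seed=None):
    """Draw a random (interval, direction) pair for every trial.

    Random choice of interval size is an assumption of this sketch.
    """
    rng = random.Random(seed)
    return [
        (rng.choice(INTERVALS_DV), rng.choice(("up", "down")))
        for _ in range(n_trials)
    ]


def meets_criterion(key, sung_directions, criterion=0.90):
    """Return True if enough sung directions agree with the key."""
    if len(key) != len(sung_directions):
        raise ValueError("one sung direction is needed per trial")
    correct = sum(
        1 for (_, direction), sung in zip(key, sung_directions)
        if direction == sung
    )
    return correct / len(key) >= criterion
```

With twenty-five trials the criterion amounts to at least 23 correct directions, since 22 of 25 is only 88 per cent.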

Since one-fourth of the children refused to sing in this experiment,
it was impossible to determine what percentage of the total group
could sing. The following procedure was undertaken, therefore, in an
effort to secure responses from every child.

Ediphone Test of Interval Singing

This  experiment  was  carried  out  (1)  in  an  attempt  to  add  more 
motivation,  (2)  to  determine  whether  the  children  sang  relatively 
accurately  or  merely  directionally,  and  (3)  to  determine  whether  the 
intervals  sung  were  exaggerated  to  a  perceptible  extent. 

The  experimenter  sang  "ding-ding"  into  the  Ediphone  horn  and 
instructed  the  child  to  sing  just  as  the  experimenter  had.  The  child 
was  promised  that  he  would  be  allowed  to  hear  later  how  closely 
his  voice  sounded  like  that  of  the  experimenter. 

A flexible procedure was used in this experiment, the basic problem
being to determine what proportion of the children could sing,
either on pitch or directionally, with this added motivation and a
reasonably objective record of response. While the dictaphone is
not sufficiently accurate to yield precise values by
phonophotography, it is accurate enough to give information
regarding tendencies of exaggeration. The accuracy of the
experimenter's ear has been previously reported (7). Root (17) has
demonstrated the accuracy of the method of repeated hearings of a
recorded stimulus.

5 The abbreviation d. v. refers to double vibrations throughout the study.

The stimuli were vocal major thirds and seconds. The exact number
of trials per child was determined as follows: If a child sang
at least nine out of the first ten trials in the same direction as the
test trials sung by the experimenter, there seemed to be little doubt
that he was testable. If he failed on more than one trial, the test
was made at least twice as long (generally twenty-five trials) in
order to determine a more accurate percentage of success.
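
The rule for fixing the number of trials can be written as a small decision function. This is a minimal sketch: the default extended length of 25 trials is the "generally" used value from the text, while the function name and the returned structure are assumptions made for illustration.

```python
def plan_trials(first_ten_correct: int, extended_length: int = 25) -> dict:
    """Apply the nine-out-of-ten rule to a child's first ten trials."""
    if not 0 <= first_ten_correct <= 10:
        raise ValueError("first_ten_correct must be between 0 and 10")
    if first_ten_correct >= 9:
        # At most one failure: the child is treated as testable
        # and the test stops at ten trials.
        return {"testable": True, "total_trials": 10}
    # More than one failure: the test is made at least twice as long
    # so that the percentage of success rests on more trials.
    # "testable" stays undecided (None) until those trials are scored.
    return {"testable": None, "total_trials": max(20, extended_length)}
```

For example, `plan_trials(9)` stops at ten trials, whereas `plan_trials(7)` calls for twenty-five.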

One hundred twenty-six children from the preschools of the
Iowa Child Welfare Research Station and the University Elementary
School of the State University of Iowa were used as subjects.
These children did not include any of the group tested in the
previous experiments. The age range was from three years, six months
to eight years, eight months.

The Ediphone proved an excellent motivating device; every child
would sing because each was promised that he would be allowed
later to hear if his voice sounded like the experimenter's. Table 1
and Figures 1 and 2 show the results.

TABLE 1
Relationship of Voco-Motor Control to Chronological and Mental Ages

                   Per Cent     Per Cent       Per Cent        Per Cent
Age,               Who Would    Singing        Singing Only    Exaggerating
Years   Children   Sing         Accurately     Directionally   Intervals
                                (90 Per Cent   (90 Per Cent
                                Success)       Success)

Chronological Age
  4        21        100            0              5               5
  5        26        100           13             23              19
  6        21        100           14             48              48
  7        35        100           38             47              47
  8        23        100           26             44              44

Mental Age
  4         9        100            0              9               9
  5        10        100            0             10              10
  6        23        100            4             13              10
  7        17        100            5             36              36
  8        24        100           42             42              42
  9        27        100           37             45              45
 10        16        100           29             64              64

Figure 1. Relationship of Voco-Motor Control to Chronological Age
[Graph: curves for directional reproduction and accurate reproduction
plotted against chronological age.]

Figure 2. Relationship of Voco-Motor Control to Mental Age
[Graph: curves for directional reproduction and accurate reproduction
plotted against mental age.]

There is a progressive increase in voco-motor control with an
increase in mental and chronological age. The correlations of mental
age and chronological age with ability to reproduce simple intervals
directionally are as follows:

Age               r      PE (r)    Eta    PE (eta)
Chronological   .505     .052     .612     .043
Mental          .424     .056     .541     .048

These  correlations  are  low  but  significant.  They  suggest  that  some 
combination  of  general  maturity  and  possibly  understanding  of 
directions  or  cooperation  is  at  work. 
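
One common way of judging significance in work of this period was to compare each coefficient with its probable error. The sketch below computes that ratio from the values reported above. The threshold of four probable errors is a conventional rule of thumb assumed here for illustration; the text does not state which criterion the authors applied.

```python
# Coefficients and probable errors (PE) as reported in the tabulation.
reported = {
    ("chronological", "r"): (0.505, 0.052),
    ("chronological", "eta"): (0.612, 0.043),
    ("mental", "r"): (0.424, 0.056),
    ("mental", "eta"): (0.541, 0.048),
}

ASSUMED_THRESHOLD = 4.0  # rule of thumb; an assumption of this sketch


def ratio_to_pe(coefficient: float, probable_error: float) -> float:
    """Return how many probable errors the coefficient spans."""
    return coefficient / probable_error


for (age, statistic), (value, pe) in reported.items():
    ratio = ratio_to_pe(value, pe)
    verdict = "exceeds" if ratio > ASSUMED_THRESHOLD else "does not exceed"
    print(f"{age:13s} {statistic:3s} ratio = {ratio:4.1f} ({verdict} threshold)")
```

Under that assumed rule all four coefficients clear the threshold comfortably, which is consistent with the description of the correlations as low but significant.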

From Table 1 it may be seen also that those who sang the intervals
in the correct direction almost always exaggerated such intervals.
This is in agreement with previous experimental data on adults.
However, a considerable number sang accurately and hence showed no
exaggeration. Since there were so few successes on either criterion
at these early ages, the data on this point are inconclusive. Failure
to exaggerate in some cases may be due to inadequate voco-motor
control at this level. At the later ages, such failure is due, in
part, to increased accuracy in singing intervals on pitch.

Actual pitch discrimination tests given to twenty-two children
who could and would sing what they heard give additional but still
inadequate information on exaggeration of intervals. In these tests
the majority of the children did not exaggerate the intervals of
the test. At the lower intervals (3 and 2 d. v.) the test was stopped
because the experimenter could not rely on his subjective estimates.
(See Appendix, p. 97.) From these data it is evident that the
singing method is limited in its use, especially in measuring
discrimination of fine intervals.

Second  Main  Interval  Singing  Test 

This test was an attempt to try out the singing method in a school
whose pupils might more nearly represent the average school
population in mental ability and socio-economic status than did the
subjects previously used. The problem here, as in the previous
interval singing test, was to determine what per cent of children
in a typical school responded at least directionally to simple
intervals. In addition, there was an attempt to determine whether any
improvement would occur after a very short training period.

In this second interval singing test eighty children from the second
to the fifth grades inclusive served as subjects.6 The age range


6  The  children  referred  to  here  were  enrolled  in  St.  Patrick's  parochial  school  at  Iowa 
City,  Iowa.  The  writer  wishes  to  thank  the  authorities  of  the  school  for  their  cooperation 
in  the  testing  program. 




was  from  six  years,  eight  months  to  twelve  years,  five  months.  The 
bells  described  earlier  were  used  as  sound  sources.  Instructions  and 
procedures  were  similar  to  those  used  in  the  first  main  interval  sing- 
ing experiment. 

TABLE 2

Per Cent of Successes in Singing a Simple Interval in Correct Direction
Before and After Minimum Practice

                                 Per Cent        Per Cent        Per Cent
                    Per Cent     Succeeding      Succeeding      Improvement
Age,                Who Would    Directionally   Directionally   With Minimum
Years    Children   Sing         Before          After           Training
                                 Training *      Training *

  7          8        100            63              63               0
  8         15        100            33              40              10
  9         18        100            44              56              20
 10         27        100            19              41              27
 11         12        100            17              17               0

* 90 per cent success.

Table  2  gives  the  results  of  the  experiment.  The  percentages  of 
success  in  directional  singing  are  fairly  comparable,  for  similar 
chronological  ages,  to  those  found  in  the  previous  school  system. 
The  frequency  of  cooperation,  however,  fell  off  markedly  at  the 
older  ages.  There  was  obvious  reticence  shown  by  many  of  the 
children  when  asked  to  sing. 

The  percentage  of  improvement  with  minimum  training  which 
is  recorded  here  may  have  been  due  entirely  to  the  breaking  down 
of  reticence  in  singing.  Whatever  the  cause,  it  may  be  seen  that  a 
short  training  period  does  have  some  slight  effect,  although  the 
total  per  cent  of  children  testable  even  after  training  is  less  than 
50  per  cent  at  the  fifth  grade  level. 
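
The improvement column of Table 2 is not defined in the text. One reading that happens to reproduce all five printed values, offered here only as an assumption, is that improvement is counted among the children who failed before training. The sketch below converts the printed percentages back to whole children; the function name and the rounding rule are illustrative and are not taken from the study.

```python
def improvement_pct(n_children, pct_before, pct_after):
    """Per cent of initially unsuccessful children who succeed after training.

    Assumption: the printed percentages were rounded from whole-child counts,
    so they are converted back to counts before the ratio is taken.
    """
    before = round(n_children * pct_before / 100)
    after = round(n_children * pct_after / 100)
    failing = n_children - before
    if failing == 0:
        return 0
    return round(100 * (after - before) / failing)


# Rows of Table 2: (age in years, children, per cent before, per cent after)
TABLE_2 = [
    (7, 8, 63, 63),
    (8, 15, 33, 40),
    (9, 18, 44, 56),
    (10, 27, 19, 41),
    (11, 12, 17, 17),
]

for age, n, before, after in TABLE_2:
    print(age, improvement_pct(n, before, after))
```

Under this reading the ten-year-old row, for example, corresponds to 6 of the 22 children who failed at first succeeding after training.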

It  is  as  important  to  recognize  the  limitations  of  the  singing 
approach  for  the  higher  levels  as  it  is  to  recognize  the  limitations 
of  the  verbal  concept  approach  for  the  lower  levels.  The  present 
experiment,  although  very  limited  in  scope,  suggests  the  inadvisa- 
bility  of  using  the  singing  technique  above  the  first  few  school 
grades. 

SUMMARY  OF  INTERVAL  SINGING  AS  A  TECHNIQUE  FOR 
MEASURING  PITCH  DISCRIMINATION 

The  interval  singing  technique  for  testing  pitch  discrimination 
provides  an  interesting  situation  for  the  young  child,  uses  instruc- 
tions which  are  easily  understood,  and  requires  no  great  effort  in 
response  (appearing  almost  reflexive  when  present). 



Results from experiments on voco-motor control, however, indicate
three major weaknesses in the method which seriously handicap
its uses as a measuring instrument: (1) Not all children can
reproduce a stimulus directionally. (2) Many sing quite accurately
and do not exaggerate intervals to an observable extent.
(3) A considerable amount of emotional resistance is encountered.
The method is valuable largely as a supplementary measure as, for
example, in cases of early voco-motor control or of late cognitive
development.


CHAPTER  III 

EXPERIMENTS  IN  METHODOLOGY:  THE 
VERBAL  CONCEPT  TECHNIQUE 

The  methods  employed  in  testing  discriminative  ability  of  adults 
and  older  children  have  almost  uniformly  incorporated  some  type 
of  verbal  response.  At  the  younger  ages,  however,  where  mental 
growth  and  experience  are  at  less  advanced  stages,  one  might 
expect  to  encounter  difficulties  (3,  6,  16).  Differences  in  cognitive 
discriminative  ability  may  be  due  to  any  of  the  following  reasons : 
(1)  difficulty  in  the  terminologies  themselves,  (2)  differences  in 
difficulty  of  the  same  terminology  when  used  in  different  sense 
fields,  and  (3)  complex  instructional  situations  which  are  too  in- 
volved for  the  child  to  grasp,  despite  the  fact  that  he  may  under- 
stand the  meaning  of  the  terminology  used.  Factors  other  than 
difficulties  in  terminology  —  pace,  motivation,  methods  of  record- 
ing—  may  also  influence  the  child's  behavior  in  a  test  situation 
in  which  the  verbal  concept  method  is  used.  All  of  these  factors 
must  be  taken  into  consideration  since  the  basic  essential  in  a  test 
of  pitch  discrimination  is  that  the  child  make  judgments  on  the 
basis  of  pitch  differences  alone. 

The  factors  chosen  for  study  in  group  and  individual  situations 
in  the  present  experiment,  of  course,  do  not  exhaust  the  list  of 
possible  influences.  Instead  those  of  major  importance  were  chosen 
for  investigation.  Tests  were  given  to  comparable  groups  begin- 
ning with  the  fifth  grade  and  proceeding  downward  to  a  point 
where profitable returns from group tests ceased. To supplement
the  group  tests,  individual  tests  were  introduced.  The  general  ar- 
rangement of  this  chapter,  therefore,  is  downward  with  respect  to 
age. 

SUBJECTS  AND  METHOD 

Group  tests  were  given  to  2,957  children  from  twenty-three  ele- 
mentary schools  in  the  Des  Moines  public  school  system.  Individual 
tests  were  given  to  forty-eight  children  from  the  elementary  schools 
of  Des  Moines,  eighty  children  from  St.  Patrick's  school  in  Iowa 



City,  and  126  children  from  the  Iowa  Child  Welfare  Research 
Station  preschools  and  the  University  Elementary  School. 

In  all  group  tests  fifty  trials  were  given  at  one  sitting.  Each 
test consisted of introductory remarks and instructions with a
short  practice  period.  A  short  pause  was  made  and  instructions 
were  repeated  after  each  ten  trials.  Full  typical  instructions  (for 
both  the  "going  up  —  going  down"  and  the  "higher-lower"  ter- 
minologies used  in  the  following  comparisons)  are  given  in  the 
Appendix  (p.  98). 

In  individual  testing  the  situation  was  less  formal,  and  the  ex- 
perimenter added  words  of  encouragement  when  they  were  appar- 
ently needed.  The  number  of  test  trials  per  sitting  varied  with  the 
individual,  as  few  as  ten  being  given  in  some  cases  and  sixty  in 
others.  Those  who  responded  to  less  than  twenty-five  trials  at  one 
sitting  were  brought  back  later  for  more  testing.  The  game  atmo- 
sphere was  always  present.  If  children  asked  questions  about  their 
performance, they were answered in an encouraging way; this
situation frequently occurred.

By  using  short  tests,  breaking  longer  tests  into  short  tests,  and 
making  the  test  a  game,  a  consistent  attempt  was  made  to  intro- 
duce motivation.  In  some  cases  there  appeared  to  be  overmotivation 
rather  than  undermotivation. 

GROUP  TESTING 

Evaluation  of  Two  Concept  Terminologies 

It  was  not  possible  in  the  present  experiment  to  test  out  every 
type  of  verbal  concept  terminology.  Two,  however,  seemed  superior 
to  the  others.  These  were  the  "higher-lower"  terminology  used  in 
the Seashore pitch discrimination test and a "going up — going
down" terminology. The latter terminology had proved itself
useful and comparatively easy to understand in preliminary individual
tests with younger children; the former had been used in
group tests as low as the third grade (2).

The  relative  effectiveness  of  each  terminology  was  tested  by  the 
use  of  comparable  groups.  The  method  of  securing  and  testing 
these  groups  was  as  follows:  Group  A  consisted  of  those  children 
seated  at  the  odd  numbered  desks,  while  Group  B  consisted  of  those 
seated at the even numbered desks.7 Group B children were taken
to a nearby classroom while the test with "going up — going


7 This method of selecting comparable groups was used in all group testing.




down"  terminology  was  given  to  Group  A.  At  the  close  of  the  first 
test  the  two  groups  exchanged  places  and  the  test  was  repeated 
but using the "higher-lower" terminology.
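
The odd and even desk split described above can be sketched in a few lines. This is only an illustration of the assignment rule; the desk numbers and the function name are hypothetical.

```python
def split_by_desk(desk_numbers):
    """Return (group_a, group_b): odd-numbered desks, then even-numbered desks."""
    group_a = [desk for desk in desk_numbers if desk % 2 == 1]
    group_b = [desk for desk in desk_numbers if desk % 2 == 0]
    return group_a, group_b


# A hypothetical classroom of twelve desks numbered 1 to 12.
group_a, group_b = split_by_desk(range(1, 13))
print("Group A (tested first):", group_a)
print("Group B (tested second):", group_b)
```

Because neighbouring desks fall into different groups, the two halves of a classroom are of nearly equal size and share the same teacher and room, which is presumably what made them comparable.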

The  first  four  intervals  on  the  Seashore  test  record  of  pitch  dis- 
crimination were  employed  as  stimuli  in  this  comparison.  Ten 
trials  each  were  given  on  the  following  intervals  in  the  order 
named: 30, 23, 17, 12, and 30 d. v. These intervals were the easiest
in  the  test  and  were  used  because  they  offered  the  minimum  of 
discrimination  difficulty.  All  recording  was  done  by  using  Form  A 
described  on  page  32. 
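
The criterion reported in the tables that follow (90 per cent success on the two easiest intervals, that is, twenty trials on 30 d. v. and ten trials on 23 d. v.) can be scored from a fifty-trial record roughly as below. This is a sketch under two assumptions not stated in the text: that 90 per cent means at least 27 of the 30 trials, and that responses are stored in order of presentation. All names are illustrative.

```python
INTERVALS_DV = [30, 23, 17, 12, 30]  # order of presentation, ten trials each
TRIALS_PER_BLOCK = 10
EASIEST = (30, 23)                   # intervals counted toward the criterion
MIN_CORRECT = 27                     # assumed: 90 per cent of 30 trials


def easiest_interval_score(correct):
    """Count correct responses on the 30 and 23 d. v. trials.

    `correct` holds 50 booleans, one per trial in presentation order.
    """
    if len(correct) != len(INTERVALS_DV) * TRIALS_PER_BLOCK:
        raise ValueError("expected 50 trials")
    total = 0
    for block, interval in enumerate(INTERVALS_DV):
        if interval in EASIEST:
            start = block * TRIALS_PER_BLOCK
            total += sum(correct[start:start + TRIALS_PER_BLOCK])
    return total


def meets_criterion(correct):
    """True when the child reaches the assumed 27-of-30 standard."""
    return easiest_interval_score(correct) >= MIN_CORRECT
```

Errors on the 17 and 12 d. v. trials do not enter the criterion, so a child may meet it while scoring poorly on the test as a whole.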

A  total  of  1,202  children  from  the  second,  third,  fourth,  and  fifth 
grades  of  eight  elementary  schools  of  the  Des  Moines  public  school 
system  was  used  in  this  experiment. 

Tables  3  and  4  give  the  results  of  the  comparison.  On  the  whole, 

TABLE 3

Per Cent Meeting Criterion With Two Terminologies (Going Up — Going
Down and Higher-Lower) With the Seashore Pitch Record

                                                    90 Per Cent Successes on
                                                    Two Easiest Intervals *
         Children Tested       Per Cent Completing  Going Up —             Higher-Lower
                               Test Blank           Going Down
         Going Up —  Higher-   Going Up —  Higher-  Per    Standard Error  Per    Standard Error
Grade    Going Down  Lower     Going Down  Lower    Cent   of Per Cent     Cent   of Per Cent

V           157        153        98         97      73       .035          70       .035
IV          140        152        95         96      67       .040          66       .039
III         166        163        98         99      39       .037          44       .039
II          102        169        87         73      24       .042          21       .032

* Twenty trials on 30 d. v. and ten trials on 23 d. v. intervals.

TABLE 4

Mean and Variability of Scores for Two Terminologies (Going Up — Going
Down and Higher-Lower) With the Seashore Pitch Record

                               Standard Deviation   Standard Deviation
Grade    Children    Mean      of Distribution      of Mean

Going Up — Going Down
V          156       42.62          10.26                0.82
IV         145       40.71          10.89                0.91
III        150       35.38          10.92                0.89
II          97       31.66          10.59                1.08

Higher-Lower
V          142       42.07           9.96                0.84
IV         143       37.52          12.99                1.08
III        149       35.48          12.18                1.00
II         152       33.25          10.05                1.03


these  results  show  no  real  differences  between  the  terminologies.  The 
mean  scores  made  by  each  grade  are  not  significantly  different 
when  using  either  terminology,  as  shown  in  the  following  tabula- 
tion:  (This  is  computed  from  Table  4.) 

                                          Ratio of Difference
Groups Compared                           to Standard Error
                                          of Difference*
Grade V (UD)** and Grade V (HL)                 0.47
Grade IV (UD) and Grade IV (HL)                 2.26
Grade III (UD) and Grade III (HL)               0.07
Grade II (UD) and Grade II (HL)                 1.07

* For complete reliability, ratio of difference to standard error of difference must
equal 3.00 or more.

** UD means "going up — going down."
HL means "higher-lower."
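
The text does not print the formula behind these ratios. A minimal sketch, assuming the conventional form for two independent groups (the difference between the means divided by the square root of the sum of the squared standard errors), reproduces the four printed values from the Table 4 figures. The function name is illustrative.

```python
import math


def critical_ratio(mean_1, se_1, mean_2, se_2):
    """Difference between two means over the standard error of the difference.

    Assumes independent groups, so the squared standard errors add.
    """
    se_difference = math.sqrt(se_1 ** 2 + se_2 ** 2)
    return abs(mean_1 - mean_2) / se_difference


# (mean, standard deviation of mean) from Table 4:
# first pair is "going up - going down", second pair is "higher-lower".
TABLE_4 = {
    "V": ((42.62, 0.82), (42.07, 0.84)),
    "IV": ((40.71, 0.91), (37.52, 1.08)),
    "III": ((35.38, 0.89), (35.48, 1.00)),
    "II": ((31.66, 1.08), (33.25, 1.03)),
}

for grade, ((m_ud, se_ud), (m_hl, se_hl)) in TABLE_4.items():
    ratio = critical_ratio(m_ud, se_ud, m_hl, se_hl)
    verdict = "reliable" if ratio >= 3.00 else "not reliable"
    print(grade, round(ratio, 2), verdict)
```

None of the four grades reaches the 3.00 standard stated in the footnote, which is the basis for the statement that the mean scores do not differ significantly.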

Neither  do  the  percentages  who  scored  90  per  cent  correct  responses 
on  the  easiest  intervals  differ  significantly:  (This  is  computed  from 
Table  3.) 

                                          Ratio of Difference
Groups Compared                           to Standard Error
                                          of Difference
Grade V (UD)* and Grade V (HL)**                0.60
Grade IV (UD) and Grade IV (HL)                 0.18
Grade III (UD) and Grade III (HL)               0.94
Grade II (UD) and Grade II (HL)                 0.71

* UD means "going up — going down."
** HL means "higher-lower."
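
The standard errors of the percentages in Table 3 are likewise given without a formula. As a sketch, and as an assumption rather than the author's stated procedure, the usual binomial expression sqrt(p(1 - p)/n) with n taken as the number of children tested reproduces most, though not all, of the printed values. Three rows of the "going up — going down" columns are shown.

```python
import math


def se_of_proportion(p, n):
    """Binomial standard error of a proportion p observed among n children."""
    return math.sqrt(p * (1 - p) / n)


# (grade, children tested, proportion meeting criterion) from Table 3,
# "going up - going down" terminology.
ROWS = [("V", 157, 0.73), ("IV", 140, 0.67), ("II", 102, 0.24)]

for grade, n, p in ROWS:
    print(grade, round(se_of_proportion(p, n), 3))
```

Dividing the difference between two such percentages by the square root of the sum of their squared standard errors then gives a ratio of the kind tabulated above.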

About  the  same  number  turned  in  completed  papers  in  both  in- 
stances, and  about  the  same  number  scored  90  per  cent  correct 
responses  on  the  easiest  intervals  by  each  method  (Figure  3). 

The  really  surprising  results,  however,  lie  in  the  fact  that  27  per 
cent  of  the  fifth  grade  children  scored  less  than  90  per  cent  correct 
responses on the easiest intervals, intervals which approach a semitone
in musical value. It seemed hardly possible that such a large
percentage  was  failing  because  of  inability  to  hear  or  even  to  under- 
stand instructions.  Some  other  factor,  possibly  pace,  might  have 
been  operating  to  nullify  any  real  differences  in  comprehension 
which  might  exist  between  the  two  terminologies.  Hence  an  ex- 
periment was  arranged  in  order  (1)  to  evaluate  the  two  terminol- 
ogies when  the  factor  of  pace  was  controlled  and  (2)  to  determine 
the  importance  of  pace  itself. 

Experiment  for  Evaluation  of  Two  Concept  Terminologies  and  the 
Factor  of  Pace 

The  same  intervals  (30,  23,  17,  12,  and  30  d.v.)  were  used  as  in 
the  previous  experiment,  but  instead  of  using  only  the  Seashore 




[Figure 3: graph of per cent meeting criterion by grade (II to V), with one curve for the UD terminology and one for the HL terminology.]

Figure 3. Per Cent Meeting Criterion (90 Per Cent Success) With Two Terminologies
("Going Up — Going Down" and "Higher-Lower")
With the Seashore Test




record of pitch discrimination, another sound source was employed.
This  was  the  bell  series  described  in  Chapter  I. 

The  bells  were  tuned  to  produce  exactly  the  same  intervals  as 
the  Seashore  record  and  were  rung  in  the  same  order.  The  only 
difference  between  the  two  tests  was  that  the  group  set  its  own 
pace in the bell test, while in a comparable group the pace
was  determined  according  to  the  standard  instructions  of  the 
Seashore  pitch  test.  In  the  bell  test  the  experimenter  sat  at  the 
back  of  the  room  and  rang  the  bells.  He  was  able  to  watch  the  re- 
cording of  the  children  and  presented  the  stimulus  pairs  only  after 
the  children  had  raised  their  heads  slightly,  signifying  readiness 
for  the  next  interval.  The  instructions  in  these  tests  were  the  same 
as  those  used  in  the  preceding  experiment,  and  the  process  was  very 
similar  (Appendix,  p.  98). 

The  subjects  in  this  experiment  were  494  children  from  four 
elementary  schools  in  Des  Moines,  Iowa.  They  were  divided  into 
four  comparable  groups.  Groups  1  and  3  were  instructed  in  the  use 
of  the  "going  up  —  going  down"  terminology  and  Groups  2  and  4 
the  "higher-lower"  terminology.  Groups  1  and  2  took  the  test 
under  the  standard  Seashore  speed  and  Groups  3  and  4  set  their 
own  pace. 


TABLE 5

Per Cent Meeting Criterion for Two Terminologies (Going Up — Going Down
and Higher-Lower) Presented at Standard Pace (Seashore Pitch
Record) and at Slower Variable Pace (Bells)

                                                    90 Per Cent Successes on
                                                    Two Easiest Intervals *
         Children Tested       Per Cent Completing  Going Up —             Higher-Lower
                               Test Blank           Going Down
         Going Up —  Higher-   Going Up —  Higher-  Per    Standard Error  Per    Standard Error
Grade    Going Down  Lower     Going Down  Lower    Cent   of Per Cent     Cent   of Per Cent

Bells
V            21         26        95         81      95       .048          80       .078
IV           38         49        97         86      92       .044          64       .069
III          38         27        76         85      63       .078          62       .089
II           34         19        85         67      48       .086          22       .095

Victrola
V            22         32        86         81      77       .090          72       .079
IV           38         43        82         88      73       .072          74       .067
III          28         25        64         68      43       .094          48       .010
II           34         20        54         20      21       .069          10       .067

* Twenty trials on 30 d. v. and ten trials on 23 d. v.




Figure 4. Per Cent Scoring 90 Per Cent Success on Easiest Intervals of Test
(1) When "Going Up — Going Down" Terminology Is Combined With
a Slower Variable Pace and (2) When "Higher-Lower" Terminology
Is Used With the Standard Victrola Pace


Grade  V 

Grade  IV 

Grade  III 

Grade  II 

0.23 

3.15 

1.22 

1.34 

1.09 

2.29 

1.56 

1.91 

1.41 

1.22 

0.85 

♦ 

1.53 

0.64 

2.70 

0.05 

1.62 

0.98 

2.21 

0.13 

1.07 

0.30 


Tables  5  and  6  with  the  accompanying  figure  (Figure  4)  present 
the  results  of  this  experiment.  Differences  are  at  once  apparent 
between the two speeds of the tests when the "going up — going
down" terminology is used. Mean scores when using the slower,
variable pace (for both UD and HL terminologies) are higher than
when using the standard speed with but two exceptions. Furthermore,
mean scores when using a combination of slow, variable pace
with "going up — going down" terminology are higher in every
case than when using the "higher-lower" terminology with the
standard pace. The following tabulation shows the significance of
these differences computed from Table 6:

                                       Ratio of Difference to Standard
Comparisons                            Error of Difference

Bells (UD)* and Bells (HL)**
Bells (UD) and Victrola (UD)
Bells (UD) and Victrola (HL)
Bells (HL) and Victrola (UD)
Bells (HL) and Victrola (HL)
Victrola (UD) and Victrola (HL)

* UD means "going up — going down."
** HL means "higher-lower."

A  larger  percentage  of  children  in  every  grade  completed  their 
papers in the bell test than in the Victrola test when using the
"going up — going down" terminology. A larger percentage also
succeeded  in  scoring  90  per  cent  correct  responses  on  the  easiest 
intervals  when  using  the  "going  up  —  going  down"  terminology. 
The  significance  of  differences  between  the  latter  percentages  is 
shown  in  the  following  tabulation  as  computed  from  Table  5 : 

                                       Ratio of Difference to Standard
                                       Error of Difference
Comparisons                            Grade V   Grade IV   Grade III   Grade II

Bells (UD)* and Bells (HL)**             1.63      3.41       0.08        2.03
Bells (UD) and Victrola (UD)             1.67      3.80       1.64        2.46
Bells (UD) and Victrola (HL)             3.19      2.25       2.79        3.49
Bells (HL) and Victrola (UD)             0.25      0.90       1.47        0.09
Bells (HL) and Victrola (HL)             1.08      1.04       1.48        0.66
Victrola (UD) and Victrola (HL)          0.42      0.10       0.51        1.15

* UD means "going up — going down."
** HL means "higher-lower."

The most significant difference in this experiment is that between
slower variable pace with the "going up — going down" terminology
and the standard Victrola pace using the "higher-lower"
terminology. In every grade the mean score and the percentage success
are higher for the former than the latter. (See Figure 4 and the
preceding tabulations.)
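
The ratios for percentage successes computed from Table 5 can be screened against the 3.00 standard quoted earlier for complete reliability. The sketch below assumes that the printed ratios are read row by row in the order Grade V, IV, III, II; that ordering is a reading of the layout, not something the text states, and the names are illustrative.

```python
RELIABILITY_RATIO = 3.00  # standard given in the footnote to the first tabulation
GRADES = ["V", "IV", "III", "II"]

# Ratios of difference to standard error of difference (percentage successes).
# Assumed order within each row: Grade V, IV, III, II.
RATIOS = {
    "Bells (UD) and Bells (HL)": [1.63, 3.41, 0.08, 2.03],
    "Bells (UD) and Victrola (UD)": [1.67, 3.80, 1.64, 2.46],
    "Bells (UD) and Victrola (HL)": [3.19, 2.25, 2.79, 3.49],
    "Bells (HL) and Victrola (UD)": [0.25, 0.90, 1.47, 0.09],
    "Bells (HL) and Victrola (HL)": [1.08, 1.04, 1.48, 0.66],
    "Victrola (UD) and Victrola (HL)": [0.42, 0.10, 0.51, 1.15],
}


def reliable_differences(ratios, threshold=RELIABILITY_RATIO):
    """Return (comparison, grade, ratio) for every ratio at or above threshold."""
    return [
        (comparison, grade, ratio)
        for comparison, row in ratios.items()
        for grade, ratio in zip(GRADES, row)
        if ratio >= threshold
    ]


for comparison, grade, ratio in reliable_differences(RATIOS):
    print(f"{comparison}, Grade {grade}: {ratio:.2f}")
```

On this reading every ratio that reaches 3.00 involves the bells with the "going up — going down" terminology, which agrees with the conclusion drawn in the text.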



[Table 6, giving the mean scores for the two terminologies with the bells and with the Victrola record, is illegible in this copy.]


In  general,  both  the  results  from  mean  scores  and  from  per- 
centages succeeding  on  the  test  indicate  that  the  "going  up  —  going 
down"  terminology  at  a  reduced  pace  (that  is,  with  use  of  the 
bells)  is  the  best  method  of  testing,  being  effective  with  over  90  per 
cent  of  the  fourth  and  fifth  grades  and  with  50  per  cent  or  more 
of  the  second  and  third  grades.  Thus  it  is  apparent  that  both 
terminology  and  pace  are  important  factors  in  determining  pitch 
responses. 

Before  leaving  this  section  it  is  interesting  to  note  that,  when  the 
standard pace was used with both the "going up — going down"
and "higher-lower" terminologies, the percentages scoring 90
per cent successes on the easiest intervals in both this and the preceding
experiment are very similar. (Compare Tables 3 and 5.)

Effect of Method of Recording on Test Performance

In  group  tests  there  is  another  factor  which  may  have  an  im- 
portant influence  on  performance ;  this  is  the  manner  in  which  the 
child  records  his  responses. 

In  the  present  study  it  was  decided  to  have  the  subject  record  in 
rows  running  across  rather  than  down  the  page.  This  simplified 
instructions  since  the  child  could  be  told  to  "go  across  the  page 
just  as  you  do  in  reading  or  writing."  The  few  questions  asked  by 
the  subjects  during  the  test  and  the  scarcity  of  incomplete  papers 
indicate  that  these  directions  were  readily  understood.  Each  ten 
trial  row  was  separated  from  the  following  row  by  a  space  large 
enough  to  prevent  confusion  on  the  part  of  the  child  as  to  which 
row  he  was  using. 

Methods  of  Recording.  —  Letters,  the  usual  substitution  for  a 
verbal  response,  were  used  in  the  group  experiments  previously 
described.  However,  these  symbols  may  be  relatively  harder  for 
children  than  other  markings,  especially  in  the  first  few  grades. 
Crosses,  circles,  lines,  and  so  forth  were  suggested  as  being  simpler 
and  more  interesting.  The  three  methods  chosen  which  gave  more 
indications  of  profitable  returns  than  others  involve  three  some- 
what different  processes:  (1)  recording  one  of  two  possible  letters 
after  each  trial  (Form  A),  (2)  making  a  cross  (x)  in  one  of  two 
boxes  after  each  trial  (Form  B),  and  (3)  drawing  a  line  along  one 
of  two  paths  in  the  direction  of  the  last  of  two  sounds  (that  is, 
going up or going down) (Form C).

Comparison of Letters and Crosses. — In this comparison 1,131
children  in  grades  two  to  five  inclusive  served  as  subjects.  This 



large  number  of  cases  was  secured  because  of  the  number  of  com- 
parisons involved. 

In  the  letter  method  the  subjects  were  told  to  record  H,  for  ex- 
ample, if  the  second  of  two  tones  was  higher  than  the  first,  or  L 
if  it  were  lower.  In  the  cross  method  they  were  told  to  put  a  cross 
(x)  in  the  higher  box  (demonstrating)  if  the  last  sound  was  higher 
or,  if  the  last  sound  was  lower,  to  put  a  cross  in  the  lower  box. 
Both  the  "higher-lower"  and  "going  up  —  going  down"  termin- 
ologies were  used  with  each  recording  form. 

Tables  7  and  8  show  the  results  of  this  comparison.  The  follow- 
ing tabulation  contains  data  on  the  significance  of  differences  be- 
tween mean  scores  made  when  using  letters  and  crosses:  (This  is 
computed  from  Table  8.) 


                                       Ratio of Difference to Standard
                                       Error of Difference
Comparisons                            Grade V   Grade IV   Grade III   Grade II

Letters (UD)* and Letters (HL)**         0.45      1.33       0.26        1.02
Letters (UD) and Crosses (UD)            0.27      0.24       0.02        0.66
Letters (UD) and Crosses (HL)            0.43      2.07       0.18        0.35
Letters (HL) and Crosses (UD)            0.19      1.14       0.26        0.46
Letters (HL) and Crosses (HL)            0.05      0.76       0.40        1.28
Crosses (UD) and Crosses (HL)            0.15      1.92       0.15        0.34

* UD means "going up — going down."
** HL means "higher-lower."

TABLE 7

Per Cent Meeting Criterion for Two Methods of Recording (Letters Versus
Crosses) and Two Terminologies (Going Up — Going Down and Higher-Lower)
Using the Seashore Pitch Record

                                                    90 Per Cent Successes on
                                                    Two Easiest Intervals *
         Children Tested       Per Cent Completing  Going Up —             Higher-Lower
                               Test Blank           Going Down
         Going Up —  Higher-   Going Up —  Higher-  Per    Standard Error  Per    Standard Error
Grade    Going Down  Lower     Going Down  Lower    Cent   of Per Cent     Cent   of Per Cent

Letters
V            79         71        89         82      68       .052          89       .050
IV           76         79        79         73      61       .056          55       .047
III          78         78        74         62      39       .055          48       .057
II           68         69        68         62      28       .054          29       .055

Crosses
V            79         71        87         85      75       .048          68       .034
IV           76         79        83         62      57       .044          44       .055
III          78         78        72         63      36       .054          40       .056
II           36         36        50         62      17       .063          21       .068

* Twenty trials on 30 d. v. and ten trials on 23 d. v.




[Table 8, giving the mean scores for the two methods of recording (letters versus crosses) under each terminology, is illegible in this copy.]

The significance of differences for percentage successes on the
easiest intervals follows: (This is computed from Table 7.)

                                       Ratio of Difference to Standard
                                       Error of Difference
Comparisons                            Grade V   Grade IV   Grade III   Grade II

Letters (UD)* and Letters (HL)**         0.01      1.03       1.21        0.01
Letters (UD) and Crosses (UD)            0.97      0.94       0.09        1.31
Letters (UD) and Crosses (HL)            0.00      1.41       0.11        0.84
Letters (HL) and Crosses (UD)            0.91      0.01       1.28        1.38
Letters (HL) and Crosses (HL)            0.02      0.21       1.14        0.96
Crosses (UD) and Crosses (HL)            0.88      1.49       0.37        0.61

* UD means "going up — going down."
** HL means "higher-lower."

It  is  apparent  that  no  significant  differences  exist  between  the  mean 
scores  made  or  the  percentages  succeeding  when  using  letters  or 
crosses with either terminology ("going up — going down" or
"higher-lower").  There  is  a  slight  tendency,  however,  for  more 
correct  completions  on  tests  where  letters  were  used. 

An  analysis  of  questions  asked  by  children  indicates  that  the 
"higher-lower"  terminology,  when  used  in  conjunction  with  the 
cross  method  of  recording,  was  the  most  difficult.  Children's  state- 
ments also  indicated  that  the  test  was  too  fast  for  all  groups  (the 
Victrola  record  was  used  in  this  comparison).  The  children  pre- 
ferred the  cross  method  to  the  letter  method  despite  the  fact  that 
performance  was  slightly  better  on  the  latter. 

Since  the  letter  method  showed  so  little  superiority  over  the 
cross  method  and  since  the  children  preferred  the  cross  method 
(or  at  least  a  method  where  marks  rather  than  letters  were  used), 
it  was  decided  to  compare  the  letter  method  with  a  third  possibility. 

Comparison of Letter Method With Circle and Road Method. —
In this comparison 424 children, grades two to five inclusive, from
three Des Moines elementary schools served as subjects.

Since  the  data  had  indicated  that  the  "going  up  —  going  down" 
terminology was more effective than the "higher-lower" terminology
and that the bells rather than the Victrola record were
best suited for testing at this level, the present comparison was
made using the bells for sound and pace and the "going up —
going down" terminology for instruction.

By  using  the  bells  it  was  possible  to  employ  one  large  interval 
(30  d.v.)  throughout  the  test  rather  than  a  series  of  increasingly 
difficult  ones  as  used  in  the  previous  testing  comparisons. 

The  recording  form  (Form  C)  with  which  the  letter  method  was 
compared is known as the circle and road form. The subjects were
instructed to place the point of the pencil in the little circle and
listen to the two sounds. If the last sound was going up, they were
to draw a line upward on the "up" road; if the last sound was
going down, they were to move their pencils downward on the
"down" road. Then they were to put the pencil point in the next
circle and listen for the next sample.

[Table 9: illegible in the scanned source.]

Tables  9  and  10  present  the  results  of  the  comparison.  From 
these  results  it  is  apparent  that  in  both  methods  from  the  second 
grade  on  practically  all  of  the  children  filled  out  their  papers. 
However,  the  per  cent  of  children  who  scored  90  per  cent  successes 
on the whole test is relatively low.⁸


TABLE 10

Mean and Variability of Scores for Two Methods of Recording (Letters Versus
Circle and Road) and Two Terminologies (Going Up — Going Down and
Higher-Lower) at a Slower Variable Pace (Bells)

                                  Standard Deviation    Standard Deviation
Grade     Children     Mean       of Distribution       of Mean

Letters
V            33        37.81            8.91                 1.56
IV           59        40.44            8.55                 1.11
III          63        40.35            9.96                 1.26
II           46        32.72            9.69                 1.43

Circle and Road
V            33        39.48           11.82                 2.07
IV           61        39.86           11.10                 1.42
III          66        40.29            9.99                 1.23
II           57        32.86           10.02                 1.34
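The last column of Table 10 can be checked against the other columns. The sketch below assumes that the "standard deviation of mean" was obtained as the standard deviation of the distribution divided by the square root of the number of children; the monograph does not state its formula, and the names used here are illustrative only.

```python
import math

# Table 10 rows: grade -> (children, SD of distribution, printed SD of mean)
letters = {"V": (33, 8.91, 1.56), "IV": (59, 8.55, 1.11),
           "III": (63, 9.96, 1.26), "II": (46, 9.69, 1.43)}
circle_road = {"V": (33, 11.82, 2.07), "IV": (61, 11.10, 1.42),
               "III": (66, 9.99, 1.23), "II": (57, 10.02, 1.34)}

def sd_of_mean(sd, n):
    """Standard deviation of a mean, assuming SD / sqrt(N)."""
    return sd / math.sqrt(n)

def largest_discrepancy(rows):
    """Largest absolute gap between the recomputed and the printed value."""
    return max(abs(sd_of_mean(sd, n) - printed) for n, sd, printed in rows.values())

for name, rows in (("Letters", letters), ("Circle and road", circle_road)):
    print(name, round(largest_discrepancy(rows), 3))
```

Under this assumption every recomputed value falls within about .01 of the printed one, which suggests the column was computed in this way or with a closely related formula.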

The significance of differences between mean scores and between
per cent of children scoring 90 per cent successes for each grade is
given below:

Groups Compared in Terms       Ratio of Difference to Standard
of Letter and Circle           Errors of Difference
                               For Means      For Percentages
Grade V                          .640              .490
Grade IV                         .320              .550
Grade III                        .003              .640
Grade II                         .070              .380
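The ratios for means can be approximated from the figures in Table 10. The sketch below assumes the two groups in each grade are independent, so that the standard error of the difference is the square root of the sum of the squared standard deviations of the two means. That formula is an assumption, since the monograph does not give one, and the function and variable names are illustrative.

```python
import math

# Table 10: grade -> (mean score, standard deviation of the mean)
letters = {"V": (37.81, 1.56), "IV": (40.44, 1.11),
           "III": (40.35, 1.26), "II": (32.72, 1.43)}
circle_road = {"V": (39.48, 2.07), "IV": (39.86, 1.42),
               "III": (40.29, 1.23), "II": (32.86, 1.34)}

def critical_ratio(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means over the standard error of the difference.

    Assumes independent groups: SE_diff = sqrt(se_a**2 + se_b**2).
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / se_diff

ratios = {grade: critical_ratio(*letters[grade], *circle_road[grade])
          for grade in letters}
for grade, ratio in ratios.items():
    print(f"Grade {grade}: {ratio:.3f}")
```

Under these assumptions the ratios for grades V, IV, and II come out close to the printed .640, .320, and .070. Grade III comes out near .03 rather than the printed .003, which may reflect a misprint or a different rounding in the source. In every grade the difference is smaller than its own standard error, in line with the conclusion that the differences are not significant.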


While a larger percentage of second, third, and fourth grade
children scored 90 per cent successes on the whole test when using
the circle and road method than when using the letter method
(Table 9) and while a larger percentage of fifth grade children
was successful when using the letter method, the differences are not
significant.

8 It may be that the monotony caused by using the same intervals tended to lower
attention and interest. Another fact which may have entered into the differences found
between these groups and previous groups was that they came from schools in a section
of the city probably lower in socio-economic status.

Despite  the  fact  that  the  differences  between  methods  are  small, 
the  statements  of  the  children,  especially  in  the  lower  grades,  indi- 
cated a  greater  interest  in  the  circle  and  road  method  than  in  the 
letter  method.  It  was  decided,  therefore,  on  the  basis  of  interest  and 
objective scores (that is, per cent successful on the whole test) to
use  the  circle  and  road  method  in  the  second,  third,  and  fourth 
grades  and  the  letter  method  in  the  fifth  grade  in  subsequent 
testing. 




INDIVIDUAL  TESTS 

In the section just completed it has been shown that there were
many  children  at  the  lower  levels  who  failed  to  understand.  There- 
fore, an  attempt  was  made  to  determine  whether  children  who 
failed  group  tests  or  those  below  the  level  where  group  tests  were 
possible  could  be  tested  by  the  concept  method  when  examined 
individually  and  thus  freed  from  many  of  the  complications  of  the 
group  setting.  Since  group  experimentation  indicated  the  easier  of 
two  concept  terminologies  to  be  "going  up  —  going  down,"  it  was 
used  in  the  individual  testing  which  follows. 

The subjects for the individual testing have been described in a
preceding  section  (p.  23-24). 

The  central  question  involved  in  the  individual  testing  was  to 
determine  whether  the  children  understood  the  concept  involved. 
It  seemed  apparent  from  observation  that  the  five-year-old  child 
in  the  preschool  laboratory  understands  the  concept  of  "going 
up  —  going  down"  in  the  visual  field.  From  previous  experimenta- 
tion at  the  older  levels,  however,  it  seems  less  certain  that  the  same 
verbal  concept  would  be  understood  in  an  auditory  setting.  In  an 
attempt  to  clarify  this  question,  answers  to  the  following  specific 
problems were sought:

1. Do children really understand a verbal concept in the visual
field at as early an age (the fifth year) as observation indicates?

2. Do these same children understand the same verbal concept in
an auditory setting?

3. If a verbal concept in audition is not developed at as early
an age as that in the visual field, what is the age relationship?

4. What is the effect of training on concept comprehension in
the auditory field?



Comprehension of "Going Up — Going Down" in the Visual Field
With Minimal Demonstrational Training

In this experiment 126 children from the Iowa preschools and
from the first and second grades of the University Elementary
School were used as subjects. The children were brought into the
testing room individually and seated at a low table. A small
two-step model was placed before the subject, on each step of which
was a "bell" (described in Chapter I). The bell on the lower step
was a major third lower in the musical scale than the one on the
upper step. The experimenter told the child to watch and listen.
"When I play the bells like this, the sounds are going up the steps,
aren't they? When I play them like this, they are going down,
aren't they? Now, like this — which way? That's right. And like
this? That's right. Now you watch me and listen to the bells and
tell me whether the sounds are going up or going down. Ready?"

Each  child  was  then  given  a  short  ten  trial  test,  the  results  of 
which  are  shown  in  Table  11.  These  results  show  that  all  of  the 
children  five  years  or  more  of  age  could  easily  comprehend  the  di- 
rections while  watching  and  listening  and  could  use  the  verbal 
concept  "going  up  —  going  down"  perfectly.  Although  some  of 
the  responses  may  have  been  based  on  auditory  discrimination,  it 
seems  more  probable  that  vision  played  the  major  part  in  the 
responses. 

Comprehension  of  "Going  Up  —  Going  Down"  in  the  Auditory 
Field  With  Minimal  Demonstrational  Training 

The  second  experiment  was  carried  out  to  determine  whether  these 
children  could  use  the  verbal  concept  terminology  of  "going  up  — 
going  down"  in  the  auditory  field  alone. 

After  the  child  had  been  given  the  short  visual  demonstration, 
he  was  told  to  turn  in  his  chair  so  that  his  back  was  toward  the 
sound  source.  He  was  then  told  to  close  his  eyes  "so  he  could  hear 
very  well"  and  was  instructed  to  listen  to  the  bell  sounds.  If  the 
sounds  were  going  up,  he  was  to  respond  "going  up";  if  they  were 
going  down,  he  was  to  respond  "going  down." 

The  experimenter  rang  the  bells  and  recorded  responses  during 
the  test,  which  was  twenty-five  trials  long  except  for  those  who 
scored  nine  out  of  ten  correct  responses  on  the  first  ten  trials.  It 
required  but  a  short  time  to  present  the  test. 

Table 11 shows the results of this test. These results indicate
that while in the visual experiment the five-year-old subjects could
readily comprehend and use the terminology "going up — going
down," in the auditory experiment only one-fourth of the
five-year-old group could use the terminology correctly. Even at eight
years of age not all of the subjects in the group were successful.

TABLE 11

Ability to Use a Simple Verbal Concept in the Fields of Vision and Audition
as Age Increases

                                                      Age, Years
                                               4    5    6    7    8    9   10

Vision
  Chronological Age
    Tested                                    21   26   21   35   23
    Per cent scoring 100 per cent successes
      on ten trial visual test                57  100  100  100  100
  Mental Age
    Tested                                     9   10   23   17   24   27   16
    Per cent scoring 100 per cent successes
      on ten trial visual test                44  100  100  100  100  100  100

Audition
  Chronological Age
    Tested                                    21   26   21   35   23
    Per cent scoring 90 per cent successes
      when using verbal concept in audition   10   23   24   80   65
  Mental Age
    Tested                                     9   10   23   17   24   27   16
    Per cent scoring 90 per cent successes
      when using verbal concept in audition    0   10    4   18   71   78   81

An examination of Table 11 indicates that there is a definite
relationship between age and ability to understand the concept of
"going up — going down" when used in either the visual or the
auditory field. Figure 5 shows this relationship clearly. The
relationship in the auditory field is especially noticeable when mental
rather than chronological age is considered. It is interesting to note
that there is a marked improvement between the six and seven
year level in chronological age and between the seven and eight
year level in mental age. This indicates that during this period
there may be a sudden increase in the ability to comprehend the
concept of "going up — going down" in the auditory field.

The Effect of Training in the "Going Up — Going Down" (Verbal)
Concept Upon Performance in the Auditory Field

The relative difference in understanding the same verbal concept
in the visual and auditory fields suggests the possibility that
inability to respond correctly in the auditory field may be due to a lack
of experience and practice. Therefore, the following experiments
were carried out to determine the effect of practice in learning a
verbal concept in audition.

[Figure 5: plot not recoverable from the scanned source.]

Effect of Minimal Training. — Two groups of children, (1) those
unable to understand instruction after a short individual test and
(2) those unable to pass a group test because of comprehension
difficulties,⁹ were given a period of short individual practice. Each
child was given a combined test-practice-test series in which the
following procedure was used: (1) eight demonstration trials,¹⁰
(2) ten test trials, (3) ten practice trials, (4) ten test trials, (5)
ten practice trials, and (6) ten test trials. Between each of these
series there was a short pause for relaxation. The bells were used
as sound stimuli, thus permitting the investigator to proceed with
the test at a pace determined by the child.

TABLE 12

Effect of Minimum Individual Practice in Learning a Verbal Concept in
Audition for Sampling of Children Failing a Group Test

                                       Per Cent Testable
          Children       Individually    After Ten Test     After Twenty Test
          Tested         Without         and Ten            and Twenty
Grade     Individually   Practice        Practice Trials    Practice Trials

V              6             66               66                  66
IV            12             36               55                  55
III           14             17               22                  28
II            16              6               31                  37

Tables  12  and  13  show  the  results  of  this  experiment.  In  the  first 
situation,  where  the  child  was  not  given  a  group  test  before  train- 
ing was  begun,  the  per  cent  of  subjects  succeeding  in  90  per  cent 
of  the  trials  at  the  several  age  levels  before  and  after  training  is 
relatively  low.  This  may  have  been  due  to  sampling  variations. 

In the latter situation the children were first presented
illustrated instructions, trial tests, and repeated instructions between
tests in the group situation before they were tested individually.
Only those were tested individually whose papers showed objective
evidence that they did not understand how to perform successfully.

9 The first group contained children from St. Patrick's School at Iowa City. The
second group was drawn from Des Moines schools and consisted of children who had taken
the final test described in the next chapter, but who had failed to pass the test. Because
of time limitations, only a representative sampling of failures (forty-eight) were given
individual instruction.

10 Demonstration trials were not given the children who failed the group test. These
were given a ten trial test immediately upon entering the individual testing room. If they
passed nine out of ten of the first ten trial series, they were tested immediately for pitch
discrimination.

TABLE 13

Effect of Minimum Individual Practice in Learning a Verbal Concept in
Audition

                                       Per Cent Testable
              Children       After           After Ten Test     After Twenty Test
              Tested         Introductory    and Ten            and Twenty
Age, Years    Individually   Eight Trials    Practice Trials    Practice Trials

 7                 8             13               25                  25
 8                15             13               27                  27
 9                18             28               39                  50
10                27              7               23                  41
11                12              8               25                  33

Table  12  shows  that  a  considerable  percentage  of  children  (in- 
creasing with  age)  is  immediately  testable  when  they  enter  the 
individual  testing  room.  Their  failures  in  the  group  situation  may 
have  been  due  to  recording  difficulties,  group  distraction,  and  so 
forth.  Some  of  them  probably  understood  the  problem  toward  the 
end  of  the  group  test  and  hence  were  ready  to  be  tested  immediately 
by  the  time  of  the  individual  period.  On  the  other  hand,  it  may  be 
seen  that  a  certain  percentage  in  the  second,  third,  and  fourth 
grades  did  profit  from  the  short  individual  training  period.  Whether 
such  increases  in  percentage  testable  are  worth  the  effort  necessary 
in  obtaining  such  data  depends  upon  the  viewpoint  of  the  in- 
vestigator. 

Effect  of  Extensive  Training.  —  Since  the  short  individual  train- 
ing period  did  yield  a  certain  percentage  of  improvement,  the 
question  arose  as  to  whether  a  longer  period  of  training  might  not 
increase  this  percentage.  This  problem  is  taken  up  in  this  section. 

In  this  experiment  126  children  from  the  preschools  of  the  Iowa 
Child  Welfare  Research  Station  and  the  University  Elementary 
School,  ages  three  years,  six  months  to  eight  years,  five  months, 
were  given  an  initial  individual  test  to  determine  whether  they 
understood  the  "going  up  —  going  down"  terminology  in  audition. 
After  the  initial  testing  was  completed  group  training  was  begun. 
This  was  carried  on  by  the  regular  music  teacher  during  the  daily 
music  hour  for  a  period  of  five  weeks.  At  the  end  of  the  five  weeks  the 
children  were  again  tested  individually. 

Table 14 presents the results of the test both before and after
the group training period. These results indicate that the greatest
improvement with extensive training comes between the years five
and seven, and that prior to and after these years there is little
improvement. It is interesting to note that during this same period
the greatest improvement without extensive training also occurs.

TABLE 14

Effect of Extensive Group Practice in Learning a Simple Verbal Concept in
Audition

                              Per Cent Testable        Per Cent Gain
              Children      Before       After         from Extensive
Age, Years    Tested        Training     Training      Training

Chronological Age
 4              16            10           19                9
 5              20            23           40               13
 6              21            24           52               28
 7              30            80           87                7
 8              21            65           71                6

Mental Age
 4               5             0            0                0
 5               7            10           14                4
 6              19             4           16               12
 7              16            18           38               20
 8              24            71           83               12
 9              23            78           91               13
10              14            81           86                5
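The gain column of Table 14 is the per cent testable after training minus the per cent testable before. The sketch below recomputes it from the figures as they can be read in this copy; the dictionary layout and function names are illustrative, not the authors'.

```python
# Table 14 rows: age -> (per cent testable before, after, printed gain)
chronological = {4: (10, 19, 9), 5: (23, 40, 13), 6: (24, 52, 28),
                 7: (80, 87, 7), 8: (65, 71, 6)}
mental = {4: (0, 0, 0), 5: (10, 14, 4), 6: (4, 16, 12), 7: (18, 38, 20),
          8: (71, 83, 12), 9: (78, 91, 13), 10: (81, 86, 5)}

def gains(rows):
    """Recomputed gain (after minus before) for each age."""
    return {age: after - before for age, (before, after, _printed) in rows.items()}

def mismatches(rows):
    """Ages at which the recomputed gain differs from the printed gain."""
    return [age for age, (before, after, printed) in rows.items()
            if after - before != printed]

print("Largest gain, chronological age:", max(gains(chronological), key=gains(chronological).get))
print("Largest gain, mental age:", max(gains(mental), key=gains(mental).get))
print("Rows to recheck:", mismatches(chronological), mismatches(mental))
```

On these figures the largest gains fall at chronological age six and mental age seven. The only row that does not reconcile is chronological age five, where 40 less 23 gives 17 against a printed gain of 13; one of the three entries may be misprinted or misread in this copy.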

In  general,  evidence  from  these  training  experiments  indicates 
that  practice  as  such  has  relatively  little  effect  on  ability  to  apply 
the  "going  up  —  going  down"  concept  in  the  auditory  field  before 
the  child  has  achieved  a  certain  mental  status.  What  improvement 
does  occur  comes  as  easily  with  a  minimum  of  practice  as  with  a 
longer  training  period  after  the  child  is  mentally  ready. 

Summary of the Verbal Concept Technique as a Testing Method

The  results  from  group  tests  on  2,957  children  have  indicated 
that  various  factors  have  a  significant  effect  on  test  scores.  It  has 
been shown that children make better scores when using
instructions involving a verbal concept terminology of "going up — going
down"  than  when  using  "higher-lower."  They  also  make  better 
scores  when  the  speed  of  the  test  is  made  variable  and  set  to  the 
pace  of  the  group  tested. 

Children do not make significantly different scores when using
any one of the types of recording methods investigated (letters,
crosses, or drawing a line). This indicates that the actual use of a
paper and pencil test is probably a more important factor than the
way in which it is used. Profitable returns from a paper and pencil
test are not found in the second grade but may be expected in
the third, fourth, and fifth grades. The advisability of a group
pencil and paper test below the third grade, therefore, is questionable.

In  regard  to  motivation  in  group  tests,  the  evidence  available 
(the  children's  statements  and  questions  and  the  number  of  com- 
pleted papers)  indicates  that  a  sustained  attention,  if  not  a  sus- 
tained interest,  was  present  throughout  the  tests  in  the  majority 
of  cases.  Modifications  of  pace,  method  of  recording,  and  terminol- 
ogy in  themselves  supplied  certain  motivating  influences. 

Individual  tests  of  pitch  discrimination  in  which  a  verbal  con- 
cept of  "going  up  —  going  down"  was  used  were  effective  as  sup- 
plementary tests  for  group  failures  in  grades  two  to  five  inclusive. 
The  percentage  of  young  children  four,  five,  and  six  years  of  age 
testable  individually  by  means  of  the  verbal  concept  decreases  rap- 
idly as  age  decreases.  There  is  a  sharp  drop  from  seven  to  five 
years  of  age.  Neither  a  minimal  nor  a  long  period  of  training  was 
effective in teaching the verbal concept of "going up — going
down" in the auditory field, although the same concept was
understood in the visual field by all children at five years of age.


CHAPTER  IV 

GROUP AND INDIVIDUAL TEST RESULTS

CHOICE  OF  INTERVALS 

The intervals in the Seashore test of pitch discrimination are
based on a geometric series beginning with 30 d.v. and ending with
.5 d.v. (with the exception of the two finest intervals, which are
rounded).¹¹ Theoretically each interval increases in difficulty in a
psychophysical fashion with ten intervals being used over a series
of 100 trials.

Data regarding the relative difficulty of the intervals of the Seashore
test are to be found in Larson's study (10).¹² Figure 6 shows
the  mean  scores  in  terms  of  per  cent  right  for  each  interval  of  the 
Seashore  pitch  discrimination  test  for  the  fifth  grade.  The  graph 
was  constructed  from  Larson's  data. 

Theoretically the curve should move downward in a more or less
regular manner. This appears to be true for most cases, although
some intervals are not as regular as might be expected. Since the
difference between 8 and 5 d.v. is not as great as that between 5
and 2 d.v. (Figure 6), one might expect to secure a more evenly
distributed and possibly a more discriminative series by narrowing
the 5 d.v. interval to 4 d.v. On the basis of Larson's data it
was decided to use the intervals 30, 17, 8, 4, and 2 d.v. It was felt
that such a series would not be too difficult for those with lesser
discriminative ability and not too easy for those of better ability. The
curve from Larson's data including only these intervals is given in
Figure 7. The value for 4 d.v. is estimated by interpolation.

NUMBER  OF  TRIALS 

A relatively short group test is necessary when dealing with
young  children.  Statistically,  of  course,  this  would  not  seem  de- 
sirable, since  shortening  a  test  usually  reduces  its  reliability. 


11 A revised test of pitch discrimination for the Seashore measures of musical talent is
now under construction. The intervals which will be used in this test for adults and
older children are 20, 15, 11, 8, 6, 5, 4, 3, 2, and 1 d.v.

12 Larson's study was undertaken for the specific purpose of "dealing with a
statistical treatment of the Seashore tests to include revision of the norms, and a treatment of
reliability and validity." (10, p. 16)


[Figure 6. Axes: per cent right, by interval (d.v.). Plot not recoverable from the scanned source.]

[Figure 7. Axes: per cent right, by interval (d.v.). Plot not recoverable from the scanned source.]

The  statistical  objection  to  a  shortened  test,  however,  holds  only 
under  certain  conditions,  as  has  been  pointed  out  by  Lanier  (9) 
and  Larson  (10).  In  a  study  with  adults  Lanier,  for  instance,  con- 
cluded that  the  Seashore  pitch  test  might  be  reduced  to  one-half  its 
present  length  and  still  be  as  reliable  as  the  whole  test.  Larson  (10, 
p.  31)  found  the  test  most  reliable  when  seventy  trials  were  used  in 
place  of  100. 

Previous  experimentation  in  this  study  indicated  that  a  fifty  trial 
test  could  easily  be  presented  to  children  if  broken  into  five  ten- 
trial  series  which  allowed  for  repetition  of  instruction  and  a  brief 
moment  of  relaxation.  A  test  of  greater  length  was  considered  in- 
advisable. It  was  decided,  therefore,  to  construct  a  test  within  a 
limit  of  fifty  trials.  The  final  test  consisted  of  five  intervals,  30,  17, 
8,  4,  and  2  d.  v.,  each  of  which  was  to  be  presented  to  the  children 
ten  times  in  a  random  order  previously  determined,  beginning  with 
the  easiest  interval  and  proceeding  to  the  most  difficult. 
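The layout of the fifty-trial test can be sketched as five blocks of ten trials, ordered from the widest interval to the narrowest. The monograph does not say what was randomized within a block. The sketch assumes it was the direction of the second tone (up or down), balanced five and five, and fixed in advance by a seed. Both assumptions, and all names, are illustrative only.

```python
import random

INTERVALS_DV = [30, 17, 8, 4, 2]   # easiest to most difficult, in d.v.
TRIALS_PER_INTERVAL = 10

def build_schedule(seed=1935):
    """Return 50 (interval, direction) trials in five ten-trial blocks.

    Assumption: direction is balanced (5 up, 5 down) and shuffled within
    each block; the seed stands in for "previously determined".
    """
    rng = random.Random(seed)
    schedule = []
    for interval in INTERVALS_DV:
        half = TRIALS_PER_INTERVAL // 2
        directions = ["up"] * half + ["down"] * half
        rng.shuffle(directions)
        schedule.extend((interval, direction) for direction in directions)
    return schedule

schedule = build_schedule()
print(len(schedule), schedule[:3])
```

Fixing the seed means every group hears the same order, which is what a printed schedule prepared in advance would also ensure.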

EXPERIMENTAL  EVALUATION  OF  TEST 

Subjects 

The  test  was  given  to  715  children  in  the  first,  second,  third, 
fourth,  and  fifth  grades  of  three  Des  Moines  public  schools.  The 
children  were  tested  in  groups  ranging  in  size  from  twenty-three  to 
thirty-seven. 

Apparatus  and  Materials 

Six  metal  bars  as  described  above  were  tuned  to  the  following 
frequencies:  440,  442,  444,  448,  457,  and  470  d.  v.  (Chapter  I). 
Three  resonating  cylinders  set  in  wood  supports  were  tuned  so  that 
two  of  them  responded  to  a  maximum  degree  to  the  440,  442,  444, 
and  448  bars,  while  the  third  responded  maximally  to  the  457  and 
470  bars. 
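The bar frequencies can be related to the test intervals and to musical units. The sketch below assumes that "d.v." (double vibrations) is equivalent to cycles per second and that each comparison bar is sounded against the 440 bar, so the differences from 440 give the intervals 2, 4, 8, 17, and 30 d.v. The conversion to cents (100 cents to an equal-tempered semitone) is standard; the names are illustrative.

```python
import math

STANDARD = 440.0                       # lowest bar, in d.v.
BARS = [440, 442, 444, 448, 457, 470]  # frequencies given in the text

def cents_above(frequency, reference=STANDARD):
    """Size of the interval from reference up to frequency, in cents."""
    return 1200 * math.log2(frequency / reference)

# interval in d.v. above the 440 bar -> size in cents
intervals = {bar - 440: cents_above(bar) for bar in BARS[1:]}
for dv, cents in intervals.items():
    print(f"{dv:>2} d.v. above 440 = {cents:5.1f} cents")
```

On this reckoning the 30 d.v. interval is a little over 100 cents, that is, roughly a semitone, and the 2 d.v. interval is under 10 cents.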

A  printed  schedule  was  used.  The  exact  procedures  using  the 
"going  up  —  going  down"  terminology  and  the  recording  forms 
used  during  the  test  are  described  in  detail  in  the  next  section  of 
this  monograph  on  "A  Pitch  Discrimination  Test  for  Young  Chil- 
dren." 

OBJECTIVE  CRITERIA  OF  TEST  COMPREHENSION 

On the basis of results already reported in this study and
elsewhere, every possible aid was given the children to help them
understand the instructions of the test. A preliminary illustrated
practice period was held before every test. Before a single trial of
the actual test was presented, the experimenter asked, "Is there
anyone who has any questions? Don't be afraid to ask me because
I want you all to be sure to understand so you can get good scores."
Not until all the children indicated that they understood was the
test begun.

It  was  felt  that,  in  addition  to  the  above  precautions,  an  objec- 
tive check  should  be  made  on  whether  or  not  instructions  were 
understood.  To  make  this  check,  the  first  interval  in  the  test  was 
purposely  chosen  large  enough  (30  d.  v.,  approximately  a  musical 
semitone)  so  that  all  but  the  very  poorest  ears  could  hear  it  easily. 
The  following  criteria  of  comprehension  were  set  up:  (1)  at  least 
nine  of  the  first  ten  trials,  all  with  30  d.  v.  intervals,  should  be 
scored correctly (probability of success here is 1 in 512); (2) there
should  be  no  evidence  of  reversal  of  instructions,  as  for  instance  five 
or  more  consecutive  wrong  responses  on  any  ten  trial  series  in  the 
test;  and  (3)  all  papers  accepted  should  be  completed. 
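The three criteria can be expressed as a simple check on a scored paper, and the chance figure can be examined under a pure guessing model. The sketch below assumes each response is recorded as True (right), False (wrong), or None (left blank), and that guessing succeeds with probability one-half on every trial. The data layout and function names are hypothetical.

```python
from math import comb

def chance_of_at_least(k, n, p=0.5):
    """Probability of k or more right out of n when every answer is a guess."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def longest_wrong_run(series):
    """Length of the longest run of consecutive wrong responses."""
    longest = current = 0
    for correct in series:
        current = 0 if correct else current + 1
        longest = max(longest, current)
    return longest

def meets_criteria(responses, series_len=10):
    """Apply the three comprehension criteria to one paper of 50 responses."""
    if any(r is None for r in responses):            # (3) paper completed
        return False
    if sum(responses[:series_len]) < 9:              # (1) nine of first ten right
        return False
    for start in range(0, len(responses), series_len):
        block = responses[start:start + series_len]  # (2) no sign of reversal
        if longest_wrong_run(block) >= 5:
            return False
    return True

print(chance_of_at_least(9, 10), 0.5 ** 9)
```

Under this model the chance of at least nine right out of ten is 11 in 1,024, or about 1 in 93. The printed figure of 1 in 512 equals one-half raised to the ninth power, the chance that nine specified trials are all right. The text does not say which event was intended.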

The following tabulation gives the proportion of children
meeting the criteria of comprehension:

            Children     Children Meeting     Per Cent Meeting
Grade       Tested       Criteria             Criteria

V             136              97                   71
IV            157             100                   64
III           204             102                   50
II            153              48                   31
I              65               4                    6

It  may  be  well  at  this  point  to  compare  the  present  fifth  grade 
results  with  those  of  Larson  (10).  The  broken  line  in  Figure  8  is 
constructed  from  Larson's  data  and  shows  the  mean  scores  for  fifth 
grade children on the various increments. The dotted line was
obtained similarly from the fifth grade data in the present experiment.
The solid line was obtained from the same group as that represented
by the dotted line, after the objective criterion of comprehension had
been  imposed. 

The difference between the curve of Larson and the one of
Hattwick based upon what may be called the observational criterion
may be said to be due, at least to a large extent, to better control
in the present test of group factors such as motivation, test
procedure, length of test, and so forth. The differences between the
broken line and the solid line, however, cannot be said to be due to
such factors alone. In the latter, cases lacking in test comprehension
have been objectively eliminated. The differences between this
curve and Larson's raise the question as to whether the factor of
comprehension was adequately cared for in her procedures.

[Figure 8. Axis: per cent right. Plot not recoverable from the scanned source.]

In  the  present  study  the  objective  criterion  of  comprehension  was 
used  to  make  sure  that  the  only  tests  included  in  the  results  were 
those  which  were  fairly  valid  measures  of  pitch  discriminative 
ability.  The  results  which  follow  are  based  only  upon  those  cases 
passing  the  objective  criterion  of  test  comprehension. 

RESULTS 

Difficulty  of  the  Test  Intervals 

Table  15  and  Figure  9  show  the  mean  scores  made  by  each  grade 
on  each  of  the  five  intervals  chosen. 

In  each  grade  the  results  are  fairly  satisfactory.  With  the  pos- 
sible exception  of  the  second  interval  (17  d.  v.),  which  is  not  as 
difficult  as  might  have  been  expected,  the  intervals  show  relatively 
consistent  increase  in  difficulty.  However,  the  relatively  large  num- 
ber of  successes  on  the  17  and  8  d.  v.  intervals  indicates  that  the 
test  might  be  made  better  by  the  removal  of  one  of  these  intervals 
and  the  substitution  of  a  more  difficult  one. 

Table 15 shows that the mean scores on this test for the four grades
second to fifth inclusive are 41, 42, 42, and 43 (means rounded) out
of a possible 50.¹³ Although these mean scores are higher than had
been anticipated on the basis of previous experimental work, the test
does discriminate fairly well as shown by the standard deviations. It
does not discriminate sufficiently, however, to take care of a large
number of individual cases; it seems probable that a better test would
result if, instead of intervals of 30, 17, 8, 4, and 2 d. v., a series
of 30, 8, 4, 2, and 1 d. v. was used.

Reliability

The total number of children¹⁴ passing the criterion of test
comprehension (that is, the cases used for reliability purposes) and
the reliabilities obtained¹⁵ are given in the following tabulation:


13  It  should  be  noted  that  the  theoretical  range  is  25  to  50,  not  0  to  50. 

14  The  children  used  in  this  test  were  pupils  in  three  elementary  schools  in  Des 
Moines,  Iowa.  One  school  drew  its  students  largely  from  a  manufacturing  district,  an- 
other was  situated  in  a  university  district,  and  a  third  was  in  a  residential  district  com- 
posed largely  of  homes  of  office  and  shop  workers. 

15 These reliabilities were obtained by the chance halves method, Spearman's formula
being used to determine the final reliability of the whole test (4).
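
The chance halves procedure named in footnote 15 can be illustrated with a short sketch. It assumes that each child's paper is available as a list of 0/1 trial outcomes, assigns the trials at random to two halves, correlates the two half scores, and steps the half-test correlation up to full length with the Spearman-Brown formula, r_full = 2 r_half / (1 + r_half). The function names, the data layout, and the fixed random seed are illustrative assumptions and are not details taken from the study.

```python
import math
import random


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_part, factor=2.0):
    """Step a part-test correlation up to a test `factor` times as long."""
    return factor * r_part / (1.0 + (factor - 1.0) * r_part)


def chance_halves_reliability(papers, seed=0):
    """papers: one list of 0/1 trial outcomes per child (hypothetical layout)."""
    trials = list(range(len(papers[0])))
    random.Random(seed).shuffle(trials)  # random assignment of trials to halves
    cut = len(trials) // 2
    half_a, half_b = trials[:cut], trials[cut:]
    scores_a = [sum(p[i] for i in half_a) for p in papers]
    scores_b = [sum(p[i] for i in half_b) for p in papers]
    return spearman_brown(pearson_r(scores_a, scores_b))
```

As a quick check of the step-up formula, a half-test correlation of .60 corresponds to a whole-test reliability of .75.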




              Children     Per Cent Passing
Grade         Tested       Criterion             r        PE_r
V               136             71             .761       .028
IV              157             64             .863       .017
III             204             50             .879       .015
II              153             31             .837       .027
I*               65              6
Total           715             49

* The meagre returns from the first grade made further consideration of group
testing at this level impractical.

The  reliability  of  the  test  for  all  grades  combined  was  .789  ±  .013. 
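
The source does not state how the probable errors attached to these reliabilities were computed. A formula in common use at the time is PE_r = 0.6745 (1 - r^2) / sqrt(N), and the sketch below assumes it; the function name is illustrative. With the fourth-grade values (r = .863, and N = 100 cases as listed later under tentative norms) it gives about .017, which agrees with the tabulated figure, although that agreement does not establish that this exact formula was used.

```python
import math


def probable_error_of_r(r, n):
    """Probable error of a correlation coefficient (assumed classical formula)."""
    return 0.6745 * (1.0 - r ** 2) / math.sqrt(n)


# Fourth grade: r = .863 on an assumed N of 100 cases.
print(round(probable_error_of_r(0.863, 100), 3))  # 0.017
```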

The reliability of the test for the second, third, and fourth grades
is relatively high, while that for the fifth grade is not so high as
might be hoped. Correlations are probably higher in the earlier grades
because these scores showed greater variability; the test was somewhat
too easy for the fifth graders.

Since the cases included in these reliabilities quite certainly
discriminated on the basis of pitch, the test would seem to be a
reliable and valid measure of pitch discrimination for the second,
third, and fourth grades. Before comparing the results with those of
other studies, it will be worth while to observe the reliabilities which
would have been obtained had an observational rather than the more
objective check been made on scores to be included in the final
results. This information was obtained by determining reliability for
all cases except those showing obvious difficulties in understanding,
as evidenced by failure to complete papers, cheating, and so forth, the
same check that was employed by previous investigators. The following
tabulation shows the results:


              Children     Per Cent Usable
Grade         Tested       for r                 r        PE_r
V               136             98             .893       .012
IV              157             96             .923       .008
III             204             94             .925       .007
II              153             67             .861       .017

When these reliabilities are compared with those above, real
differences are observable. While the reliabilities are higher
according to the latter grouping, it is more probable that under that
grouping the test is, in part, a measure of comprehension of
instruction and hence a less valid measure of pitch discrimination.

All of these data point to the conclusion that careful wording of
instructions, adequate practice trials before testing, and careful
control of conditions during testing are not enough to insure adequate
comprehension and correct application of test instructions. Even when
these conditions are met and the child apparently understands
instructions, it is still wise to provide an objective check on his
performance before deciding that the test is a true measure of
discriminative ability.

[Table 15: means and standard deviations of the scores made by each
grade (second to fifth) on each of the five test intervals. The table
is printed sideways in the source and its entries are illegible.]

[Figure 9: per cent right, by grade, on each of the five test
intervals.]

Effectiveness  of  Present  Test 

It  is  significant  that  the  number  of  children  testable  at  the  several 
levels  drops  rapidly  as  one  goes  from  the  fifth  to  the  second  grade. 
This  is  undoubtedly  due  in  part,  however,  to  the  rigidity  of  the  test 
criteria  of  comprehension. 

The  criterion  of  comprehension  was  purposely  made  high  in  order 
to  include  only  those  cases  for  which  there  was  a  high  probability 
that  the  children  were  responding  to  pitch  differences  alone.  Such  a 
criterion  excludes  the  child  who,  though  he  did  not  understand  the 
instructions  at  the  beginning,  might  have  understood  the  test  by  the
end  of  the  first  or  second  ten  trial  series.  Since  these  first  twenty 
trials  were  relatively  easy,  one  might  expect  a  number  of  instances 
of  this  type  to  have  occurred.  Before  judgment  can  be  passed  on 
the  test  as  an  effective  and  useful  measure  at  the  lower  age  levels, 
it  will  be  necessary  to  analyze  the  test  scores  further  in  an  effort  to 
determine  whether  the  actual  percentages  testable  are  not  higher 
than  suggested  here.  This  information  is  included  in  the  following 
section,  which  deals  with  causes  of  failure  in  group  tests. 

Causes  for  Failures  in  Group  Tests 

The major causes for failures in the group test may be included under
four categories: (1) inability to understand instructions; (2)
misunderstanding of directions (reversals); (3) factors of
inattention, disinterest, or fatigue; and (4) inability to
discriminate pitch differences.

After  the  above  test  was  given  and  those  papers  selected  which 
satisfied  the  criterion  of  comprehension,  the  following  number  of 
papers  still  remained  which  were  tentatively  classed  as  failures : 


                                       Grade
Children Failing       V      IV     III      II       I     Total
Number                28      32      60      82      60       264
Per Cent              20      32      46      69      93        48

Inability to Understand Instructions. — An analysis was made of those
cases which, though failing the criterion only on the initial ten
trials, scored at least nine successful trials on either one of the
second or third ten trial series. The following tabulation contains
this material:

                                                          Grades
Children                                        V     IV    III     II      I
                                                         Per Cent
Scoring nine successful responses on first
  ten trial series                             80     68     54     31      7
Scoring nine successful responses on second
  or third ten trial series but not on first    5      5     14     13     13
Total discriminating on at least the last
  thirty trials of the test*                   85     73     68     44     20

* No papers are included which contained reversals of concept or which were
incomplete.

If it is assumed that once the children "got the idea" of the test
they kept it throughout the period,¹⁶ a considerable number of cases
is added to the ranks of those who were really discriminating during
the important part of the test.

These last results reflect more accurately the true value of the test
as a diagnostic measure. Familiarity with the test as it proceeds,
learning (or, perhaps better, insight), and increased attention and
interest as the child comprehends the "idea" are probably all
operating throughout the testing situation. That this is especially
true at the lower levels is seen in the increased percentages of
children who started to discriminate, or at least to report correctly,
after the initial ten trial series.

Although these cases were not included in determining the reliability
of the test or in setting up tentative norms, it seems that the scores
of those who "got the idea" later in the test can justifiably be
compared with those who had it throughout if each ten trial series
preceding the successful series is given a score of 9. In dealing with
individual cases such a procedure is suggested.
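
The scoring adjustment suggested above can be sketched as follows. The sketch assumes that a paper is recorded as the number right on each of the five ten-trial series, in test order, and that a series counts as successful when it reaches nine right and falls among the first three series; the function and parameter names are hypothetical.

```python
def adjusted_total(series_scores, criterion=9, last_start=3):
    """Total score after crediting the series that precede the first successful one.

    series_scores: number right (0-10) on each ten-trial series, in test order.
    Returns None when none of the first `last_start` series reaches the criterion.
    """
    for i, score in enumerate(series_scores[:last_start]):
        if score >= criterion:
            credited = [criterion] * i  # each earlier series is scored as 9
            return sum(credited) + sum(series_scores[i:])
    return None


# A child who "got the idea" on the second series: 9 + 10 + 9 + 8 + 7.
print(adjusted_total([4, 10, 9, 8, 7]))  # 43
```

A paper that never reaches nine right within the first three series returns None, which corresponds to leaving it among the failures.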

Reversals.  —  Another  group  of  cases  is  made  up  of  children  who 
consistently  recorded  5,  6,  7,  8,  9,  and  even  10  consecutive  re- 
sponses wrongly  out  of  a  possible  ten.  The  chances  that  a  child  could 
record  five  consecutive  responses  wrongly  by  guessing  are  1  in  32, 
for  six  consecutive  wrong  responses  1  in  64,  for  seven  1  in  128, 
and  so  forth.  Hence  it  seems  unlikely  that  these  consecutive  wrong 
scores  are  due  entirely  to  chance.  More  probably  they  are  due  to  a 
reversal  of  the  concept  terminology. 
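
The chance figures quoted above follow from treating each guess as a fair two-way choice, so that a given run of k responses is entirely wrong with probability (1/2)^k. A minimal sketch of that arithmetic, with an illustrative function name:

```python
from fractions import Fraction


def chance_of_wrong_run(k):
    """Probability that k specified two-choice guesses are all wrong."""
    return Fraction(1, 2) ** k


for k in (5, 6, 7):
    print(k, "consecutive wrong responses: 1 in", chance_of_wrong_run(k).denominator)
```

This prints 1 in 32, 1 in 64, and 1 in 128 for runs of five, six, and seven, matching the figures in the text.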

In the present test the following percentages of children were found
to score five or more consecutive wrong responses on one of the first
three ten-trial series:

                                                          Grades
Children                                        V     IV    III     II      I
                                                         Per Cent
Reversals (five or more consecutive wrong
  responses on any one of first three
  ten-trial series)                            17     19     23     35     12
Reversals (nine or ten consecutive wrong
  responses on any one of first three
  ten-trial series)                             4      8      9      9      0

16 This was checked objectively by noting whether the responses after the successful
ten trial series continued to be correct or, because of the difficulty of the following
intervals, whether the per cent right fell off in a regular manner. Only four out of
fifty-five cases did not follow one of these sequences. These four cases were discarded.

The  total  number  of  cases  showing  reversals  (126  cases)  repre- 
sents 18  per  cent  of  the  715  children  who  were  given  the  group 
tests. 

As  might  be  expected,  there  is  more  reversing  of  concept  as  the 
age  scale  decreases  except  for  the  first  grade,  where  the  test  was 
obviously  of  little  value.  This  occurrence  of  reversals  may  account 
for  the  fact  that  the  Larson  norms  begin  below  50  per  cent  right 
responses.  It  is  a  factor  which  must  always  be  considered  in  test- 
ing for  pitch  discrimination. 

Inattention,  Disinterest,  Fatigue.  —  There  is  a  third  class  of 
failures  due  to  such  factors  as  inattention,  disinterest,  fatigue,  and 
so  forth.  There  are  undoubtedly  some  scores  in  the  present  data 
which  are  low  because  of  such  factors.  Probably  very  few  scores 
were  affected  by  fatigue  in  the  physical  sense,  since  the  test  re- 
quired little  physical  effort  and  was  short.  All  that  can  be  said 
concerning  the  influence  of  these  factors  on  the  present  test  is  that 
they  seemed  to  be  fairly  well  controlled  as  evidenced,  for  example,
in  the  reliabilities  obtained  on  the  test. 

Fourth Class of Failures. — The final class of failures contains
those children who could not discriminate differences as large as
30 d. v. No clear-cut criterion by which this class could be
differentiated has appeared to date.

VALUE  OF  THE  PRESENT  GROUP  TEST  FOR  CHILDREN
IN  FIFTH  GRADE  AND  UNDER 

One  of  the  objectives  of  this  investigation  was  that  of  building 
a  practicable  test  of  pitch  discrimination  for  younger  children. 
Three  considerations  will  be  discussed  in  this  connection :  economy 
of  time,  expense,  and  minimal  training  for  valid  testing. 

Economy  of  Time 

The present test, having but five intervals and a total of fifty
trials, takes approximately seven minutes for actual presentation
after the period of instruction has been completed. The whole period
for testing and instructions seldom lasts more than twenty minutes.
It is quite possible, therefore, to double the length of the test where
greater reliability is desired.

Cost  of  Materials 

As  the  test  is  now  constructed,  the  actual  cost  for  materials  is 
approximately  as  follows : 

6  Deagan  bars  $  9.00 

3  Resonators   (built  by  author)  3.00 

1  Rubber-headed  mallet  .25 

Recording  blanks  (per  500)  1.50 

Total  cost  $13.75 

There  remains  also  the  possibility  of  recording  the  test  on  a  phono- 
graph disc,  which  would  reduce  the  cost  considerably  for  general 
testing  purposes. 

Effectiveness  With  Minimum  of  Training 

The present test permits the inclusion of at least 90 per cent of the
fifth, 83 per cent of the fourth, 64 per cent of the third, and 56 per
cent of the second grades as testable material with a minimum of
training.¹⁷ With the exception of the first grade, profitable returns
may be expected by using the group test.

TENTATIVE  NORMS 

Cases  included  in  a  tentative  norm  summary  in  this  study  have 
been  accepted  only  after  an  objective  criterion  of  test  comprehen- 
sion was  satisfied.  Such  a  validating  procedure  resulted  in  more  and 
more  exclusions  as  the  grade  scale  descended.  The  following  tabula- 
tion shows  the  decline  of  available  cases  for  normative  purposes : 


                                       Grade
Cases                    V      IV     III      II       I     Total
Tested                 136     157     204     153      65       715
Used for Norms          97     100     102      48       4       351
Per Cent for Norms      71      64      50      31       6        49

It was hoped that approximately 100 cases would be available for each
grade. But it soon became apparent that this was practically an
impossibility in the first and second grades.

17 These percentages were obtained by adding the per cent of group test failures who
were testable individually after a minimal training to the per cent testable in the group
test. Since it was not possible to test all of the group failures individually, a random
sampling of forty-eight was taken and the per cent of these who became testable after
minimal training was used as an approximation of what testing a larger group might indicate.

Norms  are  also  made  difficult  because  the  scores  on  the  tests  are 
all  relatively  high  due  to  the  ease  of  the  first  twenty  trials.  The 
result  is  small  range  values  for  each  grade  (Table  16). 

Because  of  the  above  facts,  the  values  presented  in  the  tabula- 
tion on  page  73  of  Part  Two  of  this  monograph  are  given  for  only 
fifth  percentiles.  These  norms  are  presented  as  tentative  material. 

TABLE 16

Group Distribution in the Final Pitch Test of Scores for Tentative Norms
for the Fifth, Fourth, Third, and Second Grades

                                  Grade
Raw Score                V       IV      III      II
   50                    2        3        1       1
   49                    4        4        4       2
   48                    7        8        3       1
   47                    7        9        7       3
   46                    7       10       10       3
   45                    8        9       13       3
   44                   11        6        4       3
   43                    9        6        5       4
   42                    5        3       12       6
   41                    9        7        6       0
   40                    9       12        6       6
   39                    8        5        7       2
   38                    5        7        6       5
   37                    2        2        2       0
   36                    1        1        2       1
   35                    1        3        3       1
   34                    1        2        4       0
   33                    1        1        1       0
   32
   31                    (1, 1)
   30                    (2, 1, 2)
   29                    (3, 1)
   28                    (1)
   27                    (1)
   26

Cases                   97      100      102      48
Mean                 42.95    41.95    41.58   41.00
Standard Deviation    3.77     4.54     5.06    5.29

[For raw scores 31 to 27 the entries are shown in parentheses in the
order in which they appear; their grade columns cannot be determined.]
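
The summary rows of Table 16 can be checked against the frequency distribution. The sketch below does this for the fifth-grade column only, taking the frequencies as they appear in the table. The names are illustrative, and the use of the population form of the standard deviation (dividing by N rather than N - 1) is an assumption that happens to reproduce the tabulated value.

```python
import math

# Fifth-grade frequencies from Table 16: raw score -> number of children.
GRADE_V = {50: 2, 49: 4, 48: 7, 47: 7, 46: 7, 45: 8, 44: 11, 43: 9, 42: 5,
           41: 9, 40: 9, 39: 8, 38: 5, 37: 2, 36: 1, 35: 1, 34: 1, 33: 1}


def summarize(freq):
    """Return (number of cases, mean, population standard deviation)."""
    n = sum(freq.values())
    mean = sum(score * f for score, f in freq.items()) / n
    variance = sum(f * (score - mean) ** 2 for score, f in freq.items()) / n
    return n, mean, math.sqrt(variance)


n, mean, sd = summarize(GRADE_V)
print(n, round(mean, 2), round(sd, 2))  # 97 42.96 3.77
```

The count and the standard deviation agree with the table (97 and 3.77); the mean comes out at 42.96 against the tabulated 42.95, a difference in the last rounded digit.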

GROUP  DISCRIMINATION  LIMENS 

Discrimination  as  Related  to  Age  and  Grade 

The question of the significance of the differences in mean scores for
the different grade levels is of theoretical interest. An answer to
this question may be found from the raw data by determining whether
there are any real differences between the group scores on the hardest
intervals of the test. The following tabulation shows that there are no
significant differences between mean scores made by the several grades
on the 4 d. v. interval; on the very hardest interval (2 d. v.) real
differences begin to appear:

          Ratio of Difference to Standard Error of Difference*

                       Between Total       Between Group      Between Group
Comparison             Group Scores        Means on 4 d.v.    Means on 2 d.v.
                       on all Intervals    Interval           Interval
Grades V and IV             0.53                0.04               1.37
Grades V and III            0.80                0.08               0.82
Grades IV and III           0.38                0.13               2.30
Grades V and II             1.31                1.83               2.31
Grades IV and II            0.57                1.72               3.63
Grades III and II           1.02                1.75               1.66

* Must be at least 3.00 to insure reliability of differences between means.

The  fourth  grade  scored  significantly  more  successes  on  the  hard- 
est interval  than  the  second  grade.  The  differences  between  the 
mean  scores  for  the  fifth  and  second  and  for  the  fourth  and  third 
grades  approach  significance ;  in  both  cases  the  higher  groups  made 
better  scores.  Since  these  results  were  obtained  from  cases  all  of 
whom,  according  to  objective  evidence,  understood  the  test  and  were 
responding  primarily  to  pitch  differences,  it  is  suggested  that  there 
are  slight  differences  in  discriminative  ability  at  the  different  grade 
levels  investigated  in  this  study. 

SUMMARY 

A  test  of  pitch  discrimination  for  younger  children  has  been  con- 
structed in  which  the  discriminative  intervals,  length,  and  instruc- 
tion, so  far  as  could  be  experimentally  demonstrated,  were  best 
adapted  to  children's  level.  The  test  was  given  to  715  children  in 
the  first  five  grades  of  school. 

Analysis  of  the  data  shows  that  the  test  was  a  reliable  and  valid 
measure  for  all  but  the  first  grade.  Although  it  was  not  applicable 
to  all  subjects  tested,  it  did  measure  the  abilities  of  56  per  cent  of 
the  second  grade,  64  per  cent  of  the  third,  83  per  cent  of  the  fourth, 
and  90  per  cent  of  the  fifth  grade  under  certain  conditions  (that  is, 
with  a  minimum  of  training).  It  was  most  effective  and  reliable  at 
the  third  and  fourth  grade  levels. 

The most important limitation of the test is that its discriminative
value was not high enough at the easier and harder ends of the scale.
A suggested alteration of the test to improve the range of difficulty
was made. The suggested change should not affect the reliability or
validity of the test, except perhaps to raise the reliability slightly.

Although individual tests are recommended below, and perhaps
including, the second grade, it was shown that it is possible to
determine by means of a group test the relative abilities in pitch
discrimination of most children by the eighth year.

Finally,  the  objective  criterion  of  group  test  comprehension  used 
in  the  present  study  deserves  special  mention  since  it  clearly  showed 
the  unreliability  of  the  examiner's  observation  as  to  whether  or  not 
subjects  were  following  directions  carefully.  Data  secured  on  the 
basis  of  this  criterion  were  used  in  deriving  tentative  norms.  These
data  suggest  that  pitch  discrimination  may  improve  slightly  with 
age. 


CHAPTER  V 

GENETIC  GROWTH  OF  DIFFERENTIAL 
PITCH  SENSITIVITY 

NORMATIVE  SUMMARY  OF  DIFFERENTIAL  REACTIONS
TO  SMALL  PITCH  DIFFERENCES  FROM  FOUR
TO  TEN  YEARS  OF  AGE

As  a  result  of  the  experiments  which  have  been  presented  in 
previous  sections,  it  is  now  possible  to  present  a  picture,  or  norma- 
tive summary,  of  certain  aspects  of  the  genetic  growth  of  differ- 
ential pitch  sensitivity.  Such  a  summary  brings  together  in  or- 
ganized form  the  broader  conclusions  of  this  study. 

Four-Year-Old  Children ¹⁸

If  the  "game"  is  made  interesting  enough,  the  four-year-old
child  will  sing  readily.  Only  one  child  out  of  ten,  however,  has 
developed  sufficient  voco-motor  control  to  respond  in  the  right 
direction  to  simple  two  tone  intervals.  None  can  sing  intervals  ac- 
curately at  this  level. 

Although  five  out  of  ten  children  can  use  successfully  a  verbal 
concept  of  "going  up  —  going  down"  in  the  visual  field,  only  one 
in  ten  can  use  the  terms  successfully  in  the  auditory  field. 

Not  more  than  two  out  of  ten  children  at  this  level  can  report 
verbally  or  by  singing  whether  they  are  able  to  discriminate  large 
pitch  differences. 

Five-Year-Old  Children 

All children at this level will sing readily. Every fifth child can
sing directionally and every tenth child can sing accurately the
interval heard. Five-year-old children can tell the experimenter the
difference between "going up — going down" in the visual field.
Only two out of ten before extensive practice and four out of ten
after extensive practice are able to report successfully in the
auditory field.


18  By  a  four-year-old  child  is  meant  a  child  from  three  years,  six  months  to  four  years, 
five  months  of  age.  The  five-year-old  child  refers  to  a  child  four  years,  six  months  to  five 
years,  five  months.  The  same  description  is  true  of  older  age  groups. 
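
The age grouping defined in footnote 18 amounts to rounding a child's age to the nearest whole year, with the half year counted upward. A minimal sketch, using an illustrative function name:

```python
def nominal_age_group(years, months):
    """Map chronological age to the 'N-year-old' group described in footnote 18."""
    total_months = 12 * years + months
    return (total_months + 6) // 12


# Three years, six months through four years, five months all count as "four-year-old".
print(nominal_age_group(3, 6), nominal_age_group(4, 5), nominal_age_group(4, 6))  # 4 4 5
```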



Not  more  than  four  out  of  ten  children  at  this  level  are  testable 
for  pitch  discrimination  either  verbally  or  by  interval  singing. 

Six-Year-Old  Children  (First  Grade) 

All  children  at  this  level  will  sing  readily,  62  per  cent  will  sing 
directionally  what  they  hear,  and  14  per  cent  will  sing  accurately. 
Six-year-old  children  know  the  difference  between  "going  up  —
going  down"  in  the  visual  field,  but  only  two  out  of  ten  before 
extensive  practice  and  five  out  of  ten  after  extensive  practice  can 
use  these  terms  successfully  in  the  auditory  field. 

Slightly  over  half  of  the  children  at  this  age  level  are  testable 
for  pitch  discrimination  by  means  of  the  singing  technique,  with 
about  one-fourth  testable  by  means  of  the  concept  method. 

Seven-Year-Old  Children  (Second  Grade)

Seven-year-old  children  will  attempt  to  sing  when  asked  to  do
so.  Eight  out  of  ten  will  sing  directionally  while  four  out  of  ten 
will  sing  accurately  what  they  hear. 

Children  at  this  age  level  understand  the  verbal  concept  of 
"going  up  —  going  down"  in  the  visual  field.  Approximately  five 
out  of  ten  before  extensive  practice  and  six  out  of  ten  after  ex- 
tensive practice  can  use  the  terms  in  the  auditory  field.  There  is  a 
large  increase  in  ability  to  use  a  verbal  concept  in  the  auditory 
field  from  the  fifth  to  the  seventh  year.  From  the  seventh  year  on 
the  concept  becomes  more  effective  and  singing  less  effective  as 
methods  for  measuring  pitch  discrimination. 

Eight-Year-Old  Children  (Third  Grade) 

As  at  the  previous  levels,  all  children  will  sing  when  urged,  but 
there  are  beginnings  of  resistance.  The  child  begins  to  show  some 
signs  of  embarrassment,  saying  that  he  cannot  sing  very  well  or 
that  he  doesn't  like  to  sing.  Eight  out  of  ten  children  sing  direction- 
ally  in  the  approved  manner,  and  four  out  of  ten  sing  accurately. 

All  eight-year-old  children  understand  a  simple  verbal  concept  of 
"going  up  —  going  down"  in  the  visual  field.  Seven  out  of  ten 
before  extensive  practice  and  nine  out  of  ten  after  extensive  prac- 
tice can  use  such  a  concept  in  auditory  discrimination.  About  the 
same  number  of  children  are  measurable  by  either  the  singing  or 
verbal  concept  method;  the  latter  is  recommended. 

Group  testing  becomes  really  practical  for  the  first  time  at  this 
level,  a  majority  now  being  testable  by  this  means. 



Nine-Year-Old  Children  (Fourth  Grade) 

From  this  age  on  there  is  real  resistance  to  singing  in  the  case 
of  some  boys.  "I  can't  sing"  accompanied  by  a  negativistic  atti- 
tude is  not  uncommon.  The  attitude  does  not  represent  the  true 
picture,  however,  since  eight  out  of  ten  children  are  able  to  sing 
directionally  what  they  hear,  half  of  that  number  being  able  to 
sing  accurately. 

Eight  out  of  ten  nine-year-old  children  before  practice  and  nine 
out  of  ten  after  practice  are  able  to  tell  whether  the  sounds  are 
going  up  or  down  with  no  embarrassment  or  resistance.  The  verbal 
concept  method  is  especially  recommended,  therefore,  for  measur- 
ing purposes  at  this  level,  with  the  singing  method  as  a  supple- 
ment in  individual  cases. 

Ten-Year-Old  Children  (Fifth  Grade) 

Only  one  child  out  of  ten  is  unable  to  sing  directionally  what  he 
hears  at  ten  years  of  age ;  six  out  of  ten  sing  accurately  what  they 
hear.  It  is  difficult,  however,  to  secure  cooperation  with  the  singing 
technique. 

Eight  out  of  ten  at  this  level  understand  the  verbal  concept  in 
audition  with  preliminary  instructions ;  nine  out  of  ten  understand 
the  concept  after  a  minimal  training  period.  Either  method,  the 
singing  or  the  verbal  concept  method,  is  possible  for  individual 
measurement  of  ten-year-olds.  Group  measurement  (using  the  con- 
cept technique)  is  recommended  in  practically  all  cases  with  indi- 
vidual tests  for  only  those  children  failing  the  group  test. 

IMPLICATIONS  OF  RESULTS 

As  far  back  as  the  present  investigator  has  been  able  to  "push" 
techniques  of  differential  measurement  successfully,  there  is  evi- 
dence in  some  cases  of  very  fine  pitch  discriminative  ability  at  the 
fifth  year  level.  Additional  evidence,  however,  indicates  (so  far  as 
irrelevant  factors  could  be  objectively  controlled)  that  there  ap- 
pears to  be  some  improvement  in  discriminative  ability  with  age 
when  large  numbers  of  children  were  tested  and  average  ability 
was  determined. 

In dealing with children it is concluded that the principles of
psychophysical procedures can be fairly closely adhered to, but that
special precautions are necessary. By using fewer increments, by
shortening the test, by introducing aids and objective checks on the
cognitive aspects of discrimination, and by introducing proper
motivation, it was possible to present to children a series of paired
tones differing by small amounts and to obtain results which can be
treated by acceptable procedures.

To  the  field  of  musical  development  the  contribution  here  has 
been  limited  since  only  one  small  section  of  a  larger  area  has  been 
investigated,  that  is,  differential  pitch  sensitivity.  A  genetic  pic- 
ture of  this  one  aspect  of  musical  development,  however,  has  been 
presented.  It  has  been  shown  that  measurement  of  pitch  discrimi- 
nation is  possible  with  some  few  children  as  early  as  the  fourth 
year  and  that  the  majority  of  children  are  testable  individually  by 
the  first  year  of  school  and  in  groups  by  the  third  school  year.  It  is 
not  so  much  the  specific  finding  as  the  fact  that  practical  meas- 
urement is  possible  at  an  early  age  which  may  be  a  contribution 
to  the  field  of  musical  development. 

Finally, it appears that other tests of sensory capacities
patterned in approach after the present one are now possible for
children. It should be only a matter of time until such tests of musical
capacity are available for those who desire them.


REFERENCES 

1. Allen, G. L.: Tests in pitch discrimination of normal and feebleminded children. Training School Bull., 1923, 20, 1-8; 18-23.

2. Baldwin, Bird T., Fillmore, Eva A., and Hadley, Lora: Farm children: An investigation of rural child life in selected areas of Iowa. New York: Appleton, 1930. Pp. xxii, 337.

3. Dashiell, J. F.: A comparison of complete versus alternate methods of learning two habits. Psychol. Rev., 1920, 27, 112-135.

4. Garrett, Henry E.: Statistics in psychology and education. New York: Longmans, Green, 1926. Pp. xiii, 317.

5. Gilbert, J. Allen: Researches on school children and college students. Univ. Iowa Stud. in Psychol., 1897, 1, 1-39.

6. Grigsby, Olive John: An experimental study of the development of concepts of relationship in preschool children as evidenced by their expressive ability. State University of Iowa, Unpublished Doctor's dissertation, 1932, Pp. 165; also J. Exper. Educ., 1932-1933, 1, 144-162.

7. Hattwick, Melvin S.: The role of pitch level and pitch range in the singing of preschool, first grade, and second grade children. Child Develop., 1933, 4, 281-291.

8. Hissem, I.: A new approach to music for young children. Child Develop., 1933, 4, 308-317.

9. Lanier, Lyle H.: Prediction of the reliability of mental tests and tests of special abilities. J. Exper. Psychol., 1927, 10, 69-113.

10. Larson, Ruth Crewdson: Studies on Seashore's "Measures of Musical Talent." Univ. Iowa Stud., Series on Aims and Progress of Research, 1930, 2, No. 6, Pp. 83.

11. McGinnis, Esther: Seashore's measures of musical ability applied to children of pre-school age. Amer. J. Psychol., 1928, 40, 620-623.

12. Meissner, Herbert: Zur Entwicklung des "Musikalischen Sinnes" beim Kinde, Während des schulpflichtigen. Berlin: Trowitzsch & Sohn, 1915. Pp. 62.

13. Miles, Walter R.: Accuracy of the voice in simple pitch singing. (Univ. Iowa Stud. in Psychol., No. 6) Psychol. Monog., 1914, 16, No. 3, 13-66.

14. Montessori, Maria: The Montessori elementary material: The advanced Montessori method. Trans. by Arthur Livingston, New York: Frederick A. Stokes, 1917. Pp. xviii, 164.

15. Pavlov, Ivan Petrovich: Lectures on conditioned reflexes; Twenty-five years of objective study of the higher nervous activity (behavior) of animals. Trans. by W. Horseley Gantt and G. Volborth. New York: International Publishers, [c. 1928] Pp. 414.

16. Reymert, Martin L.: The development of a verbal concept of relationship in early childhood. Scand. Scient. Rev., 1923, 2, No. 2, 32-83.

17. Root, A. R.: Pitch patterns and tonal movement in speech. Psychol. Monog., 1930, 40, No. 1, 109-159.

18. Seashore, C. E.: The measurement of pitch discrimination: A preliminary report. Psychol. Monog., 1910-1911, 13, No. 1, 21-60.

19. Seashore, Carl Emil: The psychology of musical talent. New York: Silver Burdett, 1919. Pp. xvi, 288.

20. Stubbs, Esther M.: The effect of the factors of duration, intensity, and pitch of sound stimuli on the responses of newborn infants. [In] Irwin, Orvis C., Weiss, LaBerta A., and Stubbs, Esther M.: Studies in Infant Behavior I. Univ. Iowa Stud., Stud. in Child Welfare, 1934, 9, No. 4, Pp. 175. (p. 75-135)

21. Town, Clara H.: Analytic study of a group of five- and six-year-old children. Univ. Iowa Stud., Stud. in Child Welfare, 1921, 1, No. 4, Pp. 87.

22. Williams, Harold M.: Experimental studies in the use of the tonoscope. (Univ. Iowa Stud. in Psychol., No. 14) Psychol. Monog., 1931, 41, No. 4, 266-327.

23. Williams, Harold M.: Studies in the measurement of musical development. [In] Williams, Harold M., Sievers, Clement H., and Hattwick, Melvin S.: The Measurement of Musical Development. Univ. Iowa Stud., Stud. in Child Welfare, 1932, 7, No. 1, Pp. 191. (p. 9-107)


PART  TWO 

MANUAL OF INSTRUCTIONS AND INTERPRETATIONS
FOR A PITCH DISCRIMINATION TEST FOR YOUNG CHILDREN

BY 

Melvin  S.  Hattwick 



Part  One  of  this  monograph  has  presented  an  experimental 
evaluation  of  the  methods  applicable  in  testing  pitch  discrimination 
of  young  children  aged  four  to  ten  years  approximately.  The 
present  section  presents  a  recommended  procedure  for  group  and 
individual  testing  in  the  second,  third,  and  fourth  school  grades. 
These  are  the  levels  at  which  the  reliability  of  the  test  is  known 
to  be  relatively  high. 

MATERIALS  NEEDED 

As  constructed  at  present,  the  sound  apparatus  consists  of  three 
resonators,  six  metal  bars  (tuned  to  the  following  frequencies:  440, 
442,  444,  448,  457,  and  470  d.  v.),  and  a  small  rubber  hammer  or 
mallet.  These  are  described  in  detail  in  Part  One  of  this  monograph, 
page  13.  Of  the  intervals  produced  by  this  apparatus,  30,  17,  8,  4, 
and  2  d.  v.,  some  are  easy  enough  for  the  poorest  listener,  yet  others 
are  difficult  enough  for  all  but  the  best  listeners. 
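
As a quick check on the arithmetic, the five intervals can be recomputed from the bar frequencies listed above. This is only an illustrative sketch: it assumes that the 440 d. v. bar serves as the standard against which each other bar is compared, and the names used are hypothetical, not part of the apparatus description.

```python
# Bar frequencies in d. v. (double vibrations), as listed in the text above.
BAR_FREQUENCIES = [440, 442, 444, 448, 457, 470]

# Assumption for this sketch: the lowest bar (440 d. v.) is the standard tone.
STANDARD = BAR_FREQUENCIES[0]


def intervals_from_standard(frequencies, standard):
    """Return each bar's difference from the standard, largest first."""
    return sorted((f - standard for f in frequencies if f != standard), reverse=True)


intervals = intervals_from_standard(BAR_FREQUENCIES, STANDARD)
print(intervals)
```

Under that assumption the differences agree with the five intervals named in the text.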

For group testing Form C as described on page 32 is the
recommended record blank. This form is easily and economically
mimeographed.

INSTRUCTIONS 

The  examiner  must  be  thoroughly  familiar  with  the  mechanics 
of  the  test.  Several  periods  of  practice  in  striking  the  bars  are 
necessary  before  it  is  possible  to  produce  two  tones  that  do  not 
differ  appreciably  in  intensity.  The  bar  should  be  struck  so  that  a 
medium  loud  tone  is  heard.  Each  tone  should  last  approximately 
one  second,  then  should  be  damped  by  the  finger.  The  second  tone 
should  be  sounded  as  soon  as  the  first  is  silent. 

The examiner should sit at the rear of the testing room when
giving the test and should adapt the pace of the test to the slowest
section of the group. After the interval has been presented, he
should watch until all heads are raised slightly. It is well to have
the regular teacher at the front of the room to watch for copying,
inattention, etc.

The  following  specific  instructions  should  be  followed  as  closely 
as  possible : 

1. After the children have filled in name and so forth on the blank,
instruct them as follows: "I want you to listen carefully. This is to be a
hearing game. You will hear two sounds, one right after the other. Sometimes
the last sound will be going up, and sometimes the last sound will be going
down. Listen." Demonstrate going up and going down. "Now if the last
sound is going up you draw a line up the up road, like this." Demonstrate
on Form C reproduced on the blackboard. "Or if the last sound is going
down, you are to draw a line down the down road, like this." Illustrate.

2. Then give preliminary practice on the largest interval, allowing the
listeners to answer aloud as a group. Record correct responses on the form
on the blackboard at the front of the room. Continue preliminary practice
until all questions are answered. Repeat instructions once more just before
the test is begun. That the demonstration and preliminary exercise must
be given carefully and adequately cannot be emphasized too much.

3. Pause at the end of each ten trial series. Have the children relax for
a moment and commend or caution them depending upon their group
attention. Remind them that "the next row, row number ____, will be a little
bit harder than the row you just finished. So sit up straight, listen very
quietly, and see if you can get them all right. Remember, if the last sound
is going up you go up the up road. If the last sound is going down you
go down the down road. Ready?"

4.  If  the  question  is  raised  concerning  the  likeness  of  sounds,  tell  the 
children  that  there  is  always  a  difference. 

COMPUTATION  AND  INTERPRETATION  OF  RESULTS 

The  following  key  gives  the  standard  order  of  stimuli  for  each 
trial; it also serves as a scoring key. The degree of difficulty of each
interval  in  terms  of  d.  v.  and  per  cent  of  a  whole  tone  is  indicated 
at  the  left  of  each  row.  All  papers  should  be  checked  by  this  key, 
the  score  for  each  child  being  the  total  number  of  trials  correct 
multiplied  by  two. 


Pitch Key

Row    Per Cent of Whole Tone    Order of Stimuli*
 1        56   (30 d.v.)         UUDUDDUUDD
 2        31   (17 d.v.)         UDDDUUDDDU
 3        14   ( 8 d.v.)         DUDDDUUDDD
 4         7   ( 4 d.v.)         UUDDUDDUDU
 5         4   ( 2 d.v.)         DUUDUUDUDU

* U means up; D means down.


All test papers which (1) are incomplete, (2) have five or more
consecutive wrong responses on any ten trial series, or (3) have less
than nine correct responses on the first ten trial series should be
rejected. Such papers are not considered valid measures. (See Part
One of this monograph, page 50.) Children who fail the group
test may be tested individually later.
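
The scoring rule and the three rejection criteria can be sketched in a few lines. The following is an illustration only, not a procedure given in the manual: the key is copied from the Order of Stimuli rows above, while the function names and the representation of a paper as five ten-letter strings (any character other than U or D counting as a blank) are assumptions made for the example.

```python
# Scoring key copied from the "Order of Stimuli" rows (U = up, D = down).
KEY = [
    "UUDUDDUUDD",
    "UDDDUUDDDU",
    "DUDDDUUDDD",
    "UUDDUDDUDU",
    "DUUDUUDUDU",
]


def longest_wrong_run(marks):
    """Length of the longest run of consecutive wrong responses (False values)."""
    longest = current = 0
    for correct in marks:
        current = 0 if correct else current + 1
        longest = max(longest, current)
    return longest


def score_paper(responses, key=KEY):
    """Return (per cent score, is_valid) for one paper.

    `responses` is a list of strings, one per row of ten trials.
    """
    # Criterion 1: the paper must be complete (every trial answered U or D).
    complete = len(responses) == len(key) and all(
        len(resp) == len(answers) and set(resp) <= {"U", "D"}
        for resp, answers in zip(responses, key)
    )
    marks = [
        [r == k for r, k in zip(resp, answers)]
        for resp, answers in zip(responses, key)
    ]
    correct = sum(sum(row) for row in marks)
    valid = (
        complete
        # Criterion 2: no run of five or more consecutive errors in any row.
        and all(longest_wrong_run(row) < 5 for row in marks)
        # Criterion 3: at least nine correct on the first (easiest) row.
        and sum(marks[0]) >= 9
    )
    # Score is the number of trials correct multiplied by two.
    return 2 * correct, valid


perfect = score_paper(list(KEY))
print(perfect)
```

A paper with all fifty trials correct receives a score of 100 and is accepted; a paper that fails any one criterion keeps its numerical score but is flagged as invalid, so that the child can be retested individually.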

All scores are interpreted in terms of percentile rank. Such a
rank indicates the position which the child holds in a group on a
scale of 100, 100 representing the highest rank, 0 the lowest, and
50 the average. Since it is theoretically impossible for any
individual to have a percentile rank of 100 or 0, these are omitted.
Because of the small number of cases the norms of rank presented
here are merely tentative.

The following tabulation presents the norms. The first column
gives percentile ranks for each fifth percentile. The second, third,
and fourth columns represent the scores in per cent right for the
several grades. For example, if a second grade child made a per cent
score of 90 (score of 45 on test multiplied by 2), he would rank at the
eightieth percentile.

Tentative Norms for Pitch Scores
Score, Per Cent Right

Percentile Rank    Grade II    Grade III    Grade IV
      99             100         100          100
      95              96          97           98
      90              94          94           96
      85              92          92           94
      80              90          91           92
      75              88          90           91
      70              87          90           90
      65              85          89           89
      60              84          88           88
      55              83          86           87
      50              82          84           86
      45              81          83           85
      40              80          82           82
      35              79          80           81
      30              76          78           80
      25              74          76           78
      20              72          72           76
      15              70          71           75
      10              68          68           74
       5              64          64           68
       1              56          54           58
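
Reading a percentile rank from the tentative norms can be sketched as a simple table lookup. The values below are transcribed from the tabulation above for the second, third, and fourth grades; the lookup rule (report the highest tabulated rank whose norm the score reaches) is an assumption made for this illustration, since the manual gives only the one worked example, and the function name is hypothetical.

```python
# Tentative norms: percentile rank -> per cent right for Grades II, III, IV.
NORMS = {
    99: (100, 100, 100), 95: (96, 97, 98), 90: (94, 94, 96),
    85: (92, 92, 94), 80: (90, 91, 92), 75: (88, 90, 91),
    70: (87, 90, 90), 65: (85, 89, 89), 60: (84, 88, 88),
    55: (83, 86, 87), 50: (82, 84, 86), 45: (81, 83, 85),
    40: (80, 82, 82), 35: (79, 80, 81), 30: (76, 78, 80),
    25: (74, 76, 78), 20: (72, 72, 76), 15: (70, 71, 75),
    10: (68, 68, 74), 5: (64, 64, 68), 1: (56, 54, 58),
}
GRADE_COLUMN = {"II": 0, "III": 1, "IV": 2}


def percentile_rank(grade, per_cent_right):
    """Highest tabulated rank whose norm score is reached (assumed rule).

    Returns None when the score falls below the first-percentile norm.
    """
    column = GRADE_COLUMN[grade]
    for rank in sorted(NORMS, reverse=True):
        if per_cent_right >= NORMS[rank][column]:
            return rank
    return None


example = percentile_rank("II", 90)
print(example)
```

With this rule the example in the text is reproduced: a second grade child with a per cent score of 90 falls at the eightieth percentile.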

INDIVIDUAL  TESTS  OF  PITCH  DISCRIMINATION  FOR 

YOUNG  CHILDREN 

Approximately half of the children at the second grade level as
well as many in the third grade are not testable by the group
method. They do not complete their papers, have five or more
consecutive wrong responses on any ten trial series, or have less than
nine correct responses on the first ten trial series.

After  the  group  test  papers  are  scored  the  examiner  may  test 
individually  those  who  failed  the  group  test.  The  same  instructions 
are  used  in  the  individual  test  as  in  the  group  test.  Now,  however, 
the  child  is  relieved  of  group  distractions  and  the  necessity  of 
recording,  since  the  examiner  does  the  recording.  The  individual 
test  should  be  given  in  a  smaller  room  away  from  distractions  and 
noise; the child should sit with his back to the examiner. The test
may  be  informal,  the  child  being  told  to  "play  the  game  just  as  you 
did  yesterday,  except  that  this  time  you  tell  me  whether  the  last 
sound  is  going  up  or  down."  Instructions  are  to  be  repeated  until 
the  child  indicates  he  understands. 

DISADVANTAGE  OF  THE  PITCH  TEST 

Perhaps  the  greatest  disadvantage  of  the  pitch  test  for  children 
described  above  is  the  method  of  sound  production.  Despite  the  ease 
of  handling  the  equipment  and  presenting  the  series,  it  can  hardly 
be  expected  that  all  examiners  will  strike  the  bells  in  the  same 
manner  or  will  exercise  the  same  care  throughout  the  test.  Since  the 
production  of  sound  intervals  must  be  as  carefully  controlled  as 
possible,  a  recorded  test,  especially  with  present  excellent  methods 
of  sound  recording,  would  undoubtedly  be  superior  to  the  present 
test  provided  such  recording  takes  into  account  the  factor  of  pace 
with  young  children.  Recommendations  for  a  recorded  test  for 
young  children  are  presented  in  the  Appendix. 


PART  THREE 

A  NOTE  REGARDING  THE  PSYCHOPHYSICAL 

ANALYSIS  OF  PITCH  DISCRIMINATION 

IN  YOUNG  CHILDREN 

BY 

Harold  M.  Williams 

AND 

Melvin  S.  Hattwick 



A psychophysical analysis of the data on pitch discrimination
collected by Hattwick (2) offers certain points of interest. The
aspects of the problem to be considered in this note are (1) the
determination of group thresholds by two methods, (2) the
goodness of fit of the empirical group curves to the normal probability
integral, and (3) the distribution of individual thresholds. The
determination of group thresholds yields information regarding
normal pitch discrimination at the ages considered and evidence
regarding growth with age. The estimation of the goodness of fit of
the empirical curves offers evidence regarding the degree to which
the mathematical assumptions of the constant method have been
met in these data. The analysis of the distribution of individual
thresholds may be used as a further check on the validity of the
test as a measure of pitch discrimination.

DETERMINATION OF GROUP THRESHOLDS

By the Method of Constant Stimulus Differences

The Hattwick (2) data follow the constant method in that a
fixed series of repeated stimuli was used. They differ from the
classical method in that the variable stimuli are all of the category
"greater," so that empirical values are given for only one-half of
the curve. This method necessitates scoring as "right" and
"wrong" and sets the threshold at the 75 per cent right point.
This is now generally called (4, p. 866) the "method of constant
stimulus differences," in which the standard is presented first or
last in random fashion. The data can be treated mathematically by
at least two procedures: (1) linear interpolation and (2) curve
fitting. A comparison between the results of these two methods with
random numbers has been made by Linder (5) with results
negligibly different.

For the threshold analysis only those cases meeting Hattwick's
criteria of comprehension were used, including those having 90 per
cent right responses on the first stimulus and excluding those
having five or more consecutive wrong responses. The percentages on
which these calculations were based were computed from the data
of Table 15 of Hattwick's study (2, p. 54).

The calculations for the method of linear interpolation follow
the procedure given by Brown and Thomson (1, p. 60-61). The
threshold values by grades for this method are given in the
following tabulation:

Grade    Group Threshold in d. v.
  V            3.81
  IV           3.75
  III          3.72
  II           4.91
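
The idea of locating the 75 per cent point by linear interpolation can be illustrated as follows. This is a generic sketch and is not claimed to reproduce the Brown and Thomson procedure in detail; the percentages are invented for the example (the group percentages of Table 15 are not reproduced in this note), and the function name is hypothetical.

```python
def threshold_by_linear_interpolation(differences, per_cent_right, criterion=75.0):
    """Stimulus difference (d. v.) at which the curve first reaches `criterion`.

    Uses only the two adjacent empirical points that bracket the criterion.
    Returns None if the curve never rises through the criterion.
    """
    points = sorted(zip(differences, per_cent_right))
    for (d_lo, p_lo), (d_hi, p_hi) in zip(points, points[1:]):
        if p_lo < criterion <= p_hi:
            fraction = (criterion - p_lo) / (p_hi - p_lo)
            return d_lo + fraction * (d_hi - d_lo)
    return None


# Hypothetical per cent right at the five stimulus differences of the test.
differences = [2, 4, 8, 17, 30]
per_cent_right = [62.0, 70.0, 80.0, 92.0, 98.0]

threshold = threshold_by_linear_interpolation(differences, per_cent_right)
print(threshold)
```

In this invented case the 75 per cent point lies halfway between the 4 and 8 d. v. steps. Only two of the five empirical points enter the result, which is the limitation of the method noted in the next section.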

By  the  Curve  Fitting  Method 

Since the method of linear interpolation uses only two empirical
values and the Hattwick data give five points, the group thresholds
were also calculated by a curve fitting method which makes it
possible to utilize all the data. For this purpose Kelley's procedure (3,
p. 326-330) seemed to be the most convenient.

In this method the mean and standard deviation of the theoretical
curves were determined, and the threshold (75 per cent correct
judgments) located at a point one probable error above the mean.
The tabulation below gives the statistical constants and the
threshold values for the theoretical curves for the four grade groups
considered:

                     Constants for Theoretical Curve        Group
                                Standard     Probable     Threshold
Grade    Children     Mean      Deviation     Error        in d.v.
  V         97       -6.552      14.022        9.458        2.908
  IV       100       -7.733      16.354       11.031        3.298
  III      102       -5.580      14.902       10.051        4.471
  II        48       -4.421      15.657       10.561        6.140
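
The relation between the tabulated constants can be checked directly: the probable error of a normal curve is about 0.6745 times its standard deviation, and the threshold is the mean plus one probable error. The sketch below recomputes the last two columns from the tabulated means and standard deviations. It does not reproduce Kelley's fitting procedure itself, and the names are hypothetical.

```python
from statistics import NormalDist

# One probable error is the deviation that cuts off the 75th percentile of a
# normal curve, about 0.6745 standard deviations.
PE_FACTOR = NormalDist().inv_cdf(0.75)

# Grade -> (mean, standard deviation) of the fitted curve, from the tabulation.
CONSTANTS = {
    "V": (-6.552, 14.022),
    "IV": (-7.733, 16.354),
    "III": (-5.580, 14.902),
    "II": (-4.421, 15.657),
}


def probable_error(sd):
    """Probable error of a normal curve with standard deviation `sd`."""
    return PE_FACTOR * sd


def group_threshold(mean, sd):
    """Point one probable error above the mean (75 per cent correct)."""
    return mean + probable_error(sd)


thresholds = {grade: group_threshold(m, s) for grade, (m, s) in CONSTANTS.items()}
for grade, value in thresholds.items():
    print(grade, round(probable_error(CONSTANTS[grade][1]), 3), round(value, 3))
```

The recomputed values agree with the tabulated probable errors and thresholds to within rounding.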

The thresholds as determined by both methods give decreasing
values as age increases. This corroborates Hattwick's results on
mean gross scores. The precautions taken to minimize the effect of
such factors as comprehension, attention, and interest in these
samples of data have been discussed by Hattwick. To the degree
that the test really measured pitch discrimination, it must be
concluded that there is a real but fairly small change with age in the
function.



FIT  OF  GROUP  DATA  AND  NORMAL  PROBABILITY 

INTEGRAL 

Do the group data fit the theoretical normal probability integral?
Results on this point give a partial answer to the question of
whether the assumptions underlying the constant method are
satisfied by the data. The best answer to this problem would come,
of course, from the study of the data for individual cases, since
group analysis involves the further assumption that the group
results are normally distributed. The data obtained on any
individual were, however, far too meagre to justify the elaborate curve
fitting technique. An analysis of similar results for individuals by
inspection and by the method of linear interpolation is given later
in this article.

Pearson's test of goodness of fit by the second method given in
Brown and Thomson (1), when the successive percentages are
relatively independent, was applied to the group data obtained by the
curve fitting technique. The P-values, derived from Pearson's
tables (6), are given in the following tabulation, and the empirical
and theoretical curves in Figure 1.

Grade    P-Value
  V       .148
  IV      .328
  III     .252
  II      .208

The fit, in most cases, is only fairly satisfactory. As may be seen
from Figure 1, however, the poorness of fit is not basically due to
pronounced skew in the empirical data. Tendencies in the
direction of inversion seem to play a more prominent part. The goodness
of fit does not vary systematically with age. This would suggest that
any factors reducing the goodness of fit operate as effectively at
one age as another.

DISTRIBUTION  OF  INDIVIDUAL  THRESHOLDS 

An examination of the individual curves of percentages revealed
three conditions in the data: (1) the conventionally adequate curve,
(2) curves containing inversions, and (3) curves indicating that
the test was not sufficiently difficult, that is, curves which did not
cross the 75 per cent line. Figure 2 gives a sample curve of each
form from the data. The relative frequencies of the three types of
curve by grades are given in the following tabulation:


                     Conventional Type        Inversions            Too Easy
Grade    Children    Number   Per Cent    Number   Per Cent    Number   Per Cent
  V         97         61        63         12        12         24        25
  IV       100         54        54         22        22         24        24
  III      102         73        72         14        14         15        14
  II        48         38        79          2         4          8        17

[Figure 1. Empirical and theoretical group curves, by grade; vertical axis: per cent right.]

[Figure 2. Sample individual curves of each of the three types; vertical axis: per cent right.]

The  category  "inversions"  includes  only  cases  where  the  inversion 
crossed  the  75  per  cent  value.  The  inversions  are  not  particularly 
significant  in  such  a  small  sample  of  trials  per  child.  The  number 
of  children  for  whom  the  test  seemed  too  easy  is  surprising  and 
points  to  the  need  for  a  finer  instrument  for  the  measurement  of 
some  cases  even  at  these  ages.  The  mean  and  standard  deviation  of 
individual  thresholds,  determined  by  linear  interpolation  from  the 
group of conventional curves, are given in the following tabulation:

Grade    Mean Threshold    Standard Deviation
  V          5.00               3.32
  IV         6.13               5.63
  III        6.18               5.48
  II         7.55               7.43

Figure 3 shows the distributions.
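
The sorting of individual curves into the three types distinguished above can be expressed as a small rule. The sketch below is one possible reading of the verbal definitions, not the authors' own procedure: a curve is "too easy" if it never falls below 75 per cent, an "inversion" if it crosses the 75 per cent line more than once, and "conventional" otherwise. It assumes the percentages are ordered from the smallest to the largest stimulus difference and that the largest difference is at or above the criterion, as the comprehension requirement implies. All data and names are hypothetical.

```python
def classify_curve(per_cent_right, criterion=75.0):
    """Classify one child's curve of per cent right by stimulus difference.

    `per_cent_right` is ordered from the smallest to the largest difference.
    """
    above = [p >= criterion for p in per_cent_right]
    if not above[-1]:
        raise ValueError("largest difference is below the criterion")
    if all(above):
        # The curve never crosses the criterion line: test too easy.
        return "too easy"
    # Count how many times the curve passes through the criterion line.
    crossings = sum(1 for a, b in zip(above, above[1:]) if a != b)
    return "inversion" if crossings > 1 else "conventional"


examples = {
    "rising": [60, 70, 80, 90, 100],
    "dips below the line": [80, 70, 80, 90, 100],
    "always high": [80, 90, 90, 100, 100],
    "dip above the line": [60, 80, 78, 90, 100],
}
labels = {name: classify_curve(curve) for name, curve in examples.items()}
print(labels)
```

Note that the last example, whose reversal stays above 75 per cent, is counted as conventional, in keeping with the statement that the inversion category includes only cases where the inversion crossed the 75 per cent value.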




The appearance of a substantial variation in individual
thresholds supports the view that the test was genuinely measuring
pitch discrimination, since individual variation in pitch sensitivity
within this selected sample of cases would be expected. The obvious
skew is partly a function of the failure of the test to distribute
scores below 2 d. v.

The  applicability  of  psychophysical  procedures  to  young  children 
rests  essentially  on  the  basis  of  their  ability  to  compare  repeatedly 
two  things  under  relatively  constant  instruction.  Simple  choices 
are  continually  being  made  by  children  even  at  the  preschool 
level.  To  cast  these  choices  into  patterns  which  will  be  adequate 
for  psychophysical  analysis  is  another  matter.  The  procedure  used 
by Hattwick seems to have been fairly successful at the ages given.
While  these  data  are  suggestive,  it  is  probable  that  the  degree  to 
which  the  necessary  conditions  are  met  will  have  to  be  determined 
for  each  new  problem. 


[Figure 3. Distributions of individual thresholds, by grade; vertical axis: per cent.]



REFERENCES 

1. Brown, William, and Thomson, Godfrey H.: The essentials of mental measurement. 3rd ed. Cambridge, Mass.: University Press, 1925. Pp. x, 224.

2. Hattwick, Melvin S.: A genetic study of differential pitch sensitivity. [In] Hattwick, Melvin S., and Williams, Harold M.: The Measurement of Musical Development II. Univ. Iowa Stud., Stud. in Child Welfare, 1935, 11, No. 2. Pp. 100. (p. 7-68)

3. Kelley, Truman L.: Statistical method. New York: Macmillan, 1923. Pp. xi, 390.

4. Kelley, Truman L., and Shen, Eugene: The statistical treatment of certain typical problems. [In] Murchison, Carl [editor]: The Foundations of Experimental Psychology. Worcester, Mass.: Clark University Press, 1929. Pp. x, 907. (p. 855-883)

5. Linder, Forrest E.: A statistical comparison of psychophysical methods. (Univ. Iowa Stud. in Psychol., No. 17) Psychol. Monog., 1933, 44, No. 3, 1-20.

6. Pearson, Karl: Tables for statisticians and biometricians. Part I. 2nd ed. Cambridge, Eng.: Cambridge University Press, 1924. Pp. lxxxiii, 143.


PART  FOUR 

IMMEDIATE  AND  DELAYED  MEMORY   OF 

PRESCHOOL  CHILDREN  FOR  PITCH 

IN  TONAL  SEQUENCES 

BY 

Harold  M.  Williams 



It has been the purpose of this study to estimate the degree of
relationship existing at the preschool level between what may be
called immediate and delayed recall of pitch sequences. A
subsidiary purpose was to measure the effectiveness of a fairly intensive
course of training in delayed recall. Vocal accuracy in immediate
reproduction of the sequences in the test for tonal reproduction
previously reported by Williams (3, p. 70-79) served as the
criterion for immediate recall. Accuracy in the independent singing
of comparable sections from learned nursery songs served as the
criterion for delayed recall. While voco-motor control is necessarily
a factor in the use of these criteria, it nevertheless can be considered
a constant in the comparisons made here.

EXPERIMENTAL  CONDITIONS 

For the immediate reproduction the voice was used as the
stimulus. The child was instructed to sing the brief tonal sequences
immediately after the experimenter. For the delayed reproduction
a pitch pipe was used to give the keynote. The children were
simply instructed to sing the songs, starting at the given pitch. All
stimuli and reproductions were recorded on the dictaphone. The
records were later transcribed by ear to a special musical score. The
recording was to the nearest half-tone.¹

The training period was informal but controlled with respect to
the number of practices given the children. The group method of
training was used, the teachers presenting the song vocally with
the piano as accurately as possible and then inviting the children to
join them in singing it. The efforts to obtain participation were
quite generally successful for all members of the group. The songs
were in every case unfamiliar to the children at the beginning of the
training period. Two practices were given daily for each
song. The test songs were introduced, along with other songs, into
a regular singing period. This plan was followed in order to avoid
as far as possible the effects of boredom. The training period
included a total of sixty practices, extending over thirty school days.
This period of six weeks seemed to represent the maximum time
which would allow a sufficient number of practices while partially
avoiding the effects of boredom and maturation.

¹ The validity of this method has already been discussed by Hattwick (1, p. 177-181).
The experimenter (Hattwick) had already shown an accuracy of vocal control and pitch
discrimination approaching the accuracy of the machine itself.

While three songs were introduced for the purposes of the
experiment, the data from only one were complete enough to warrant
statistical study. This was "Cock-a-Doodle-Doo," the score for
which is given in Figure 1. This song was taught to the five- and
four-year-old groups of the Iowa Child Welfare Research Station
and the University Elementary School. Due to absences, the
number of children available for the various computations varied. The
mean age of the forty-one children in this study was 60.3 months, the
range 47 to 70 months.


Figure 1. Musical Score of Test Song in Delayed Recall Series


Objective measurements on achievement (by the dictaphone
technique) were made before training and after ten, thirty-one, and
sixty practices. The children were tested individually in a small
testing room. Since they had already been taught the words of the
songs, the preliminary tests were conducted by having the
experimenter sing each song to the child in sections as indicated by the
brackets in Figure 1, and requiring the child to sing each section
immediately after him. That this first presentation represented
essentially a test of immediate reproduction is logically obvious. This
approach, however, seemed the most reasonable, since it offered a
further means of measuring the reliability of the test of immediate
recall.

METHOD  OF  SCORING 

Two types of score are possible from the records. In the first
place, one can simply record the number of correct responses, tone
by tone (correct responses). A second weighted score may be
obtained by recording for each tone the amount of error in half-tone
units and assigning the cumulative total as a score (error score).
According to the latter method of scoring, the near monotone, for
example, will be more heavily penalized than the child who makes
only small or occasional errors. The correlation on the reproduction
(Williams) series between the correct and the error score was
-.791, probable error .040, for forty children on the first
administration and -.974, probable error .007, for twenty-seven children
on the second administration. This indicated a substantial similarity
in result from both methods of scoring. In this study the weighted
error scores were used throughout.

The  weighted  error  scores  of  the  delayed  memory  series  may  be 
computed  from  the  standard  tone  as  a  base  or  from  the  child's  own 
keynote.  The  scoring  for  the  delayed  recall  series  was  computed 
from  the  notes  indicated  by  the  first  and  last  brackets  in  Figure  1, 
omitting  the  middle  group. 
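
The two types of score, and the two bases from which the error score may be reckoned, can be illustrated with a short sketch. The note sequences below are invented, pitches are written as half-tone steps above an arbitrary reference, and the child's own keynote is taken to be the first tone sung. These are all assumptions made for the example; they are not taken from the records of the study.

```python
def correct_score(target, sung):
    """Number of tones reproduced exactly (to the nearest half-tone)."""
    return sum(1 for t, s in zip(target, sung) if t == s)


def error_score(target, sung):
    """Cumulative error in half-tone units, reckoned from the standard tones."""
    return sum(abs(t - s) for t, s in zip(target, sung))


def error_score_from_own_keynote(target, sung):
    """Error score after shifting the sung tones so the first tones coincide.

    Assumption: the child's own keynote is the first tone sung.
    """
    shift = target[0] - sung[0]
    return error_score(target, [s + shift for s in sung])


# Hypothetical five-tone section, in half-tone steps above the keynote.
target = [0, 4, 7, 4, 0]
near_monotone = [0, 0, 0, 0, 0]
small_error = [0, 4, 6, 4, 0]
transposed = [2, 6, 9, 6, 2]

print(correct_score(target, near_monotone), error_score(target, near_monotone))
print(correct_score(target, small_error), error_score(target, small_error))
print(error_score(target, transposed), error_score_from_own_keynote(target, transposed))
```

In this invented case the near monotone is penalized far more heavily by the error score than the child with one small error, while a child who sings the correct intervals from a different starting pitch is penalized only when the standard tone is used as the base.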

LEARNING  CURVES 

The  group  means  and  standard  deviations  for  the  error  scores  for 
the  tests  of  delayed  recall  are  given  in  Table  1  for  two  groups,  one 
of  twenty-one  children  having  complete  records  for  the  whole  series 
and  one  of  thirty-one  children  with  records  on  only  the  ten  and 
thirty-one  trial  practice  periods.  The  scores  are  tabulated  from  two 
bases, the stimulus and the child's own keynote.

TABLE 1
Group Error Scores in Delayed Recall for Children Completing the Training Series

                           21 Children*             31 Children**
Test                     Mean    Standard         Mean    Standard
                                 Deviation                Deviation

Standard Stimulus
  Preliminary            51.6      21.4           53.9      26.0
  After 10 practices     53.9      30.0           57.9      28.6
  After 31 practices     37.6      39.1           38.0      35.3
  After 60 practices     34.0      25.0

Child's Own Keynote
  Preliminary            60.0      31.6           58.5      33.3
  After 10 practices     42.6      26.6           46.3      26.0
  After 31 practices     30.0      28.0           32.2      28.6
  After 60 practices     44.7      37.6

*  Mean Age, 61.8 Months; Age Range, 47-69 Months.
** Mean Age, 61.9 Months; Age Range, 47-70 Months.

The general characteristics of the learning curve represented in the data of Table 1 may be seen from Figure 2. That the first
presentation  represented  largely  a  test  of  immediate  reproduction 
has  already  been  pointed  out.  It  will  be  remembered  that  on  all 
subsequent  tests  the  child  sang  unaided  except  for  the  sounding  of 
the  keynote.  Under  these  circumstances  the  lack  of  improvement, 
as  shown  in  Figure  2,  after  ten  trials  is  not  surprising.  The  poorer 
section of the group (+1 standard deviation) actually did less well at the tenth trial, although the better section (-1 standard deviation) was more accurate at the tenth trial. This indicates the
rapidity  of  the  learning  in  the  better  group.  The  age  means  and 
ranges  for  the  groups  of  thirty-one  and  twenty-one  children  each 
are  practically  identical,  showing  that  the  data  from  the  two 
groups  are  quite  comparable  as  far  as  selection  in  terms  of  age  is 
concerned.  The  loss  in  numbers  resulted  entirely  from  the  acciden- 
tal circumstance  of  absence  on  account  of  illness  or  other  factors. 

Marked  improvement  in  absolute  error  scores  is  shown  from  the 
tenth  to  the  thirty-first  trial,  as  may  be  seen  from  Figure  2.  The 
mean  improvement  from  the  thirty-first  to  the  sixtieth  trial  is  much 
less.  By  the  thirty-first  trial  the  better  section  had  learned  the  song 
to  practical  perfection  and  actually  did  worse  at  the  sixtieth  trial. 
This  is  interpreted  as  the  result  of  boredom  with  the  song. 

The  curves  calculated  from  the  standard  and  from  the  child's 
own  keynote  show  one  systematic  difference  in  that  from  the  be- 
ginning there  is  improvement  in  the  latter.  Scores  obtained  by  cal- 
culating from  the  child's  own  keynote  reveal  the  acquisition  of  a 
pattern  more  definitely  than  do  scores  calculated  from  the  standard, 
where  a  false  start  with  perfect  pattern  might  still  yield  a  large 
error  score.  On  this  basis  it  is  suggested  that  the  curve  for  the 
former  offers  evidence  of  improvement  in  the  acquisition  of  the 
pattern  as  such  in  the  first  ten  trials. 
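The difference between the two bases of calculation may be made concrete by a short sketch. It is assumed here, for illustration only, that the first tone of each rendition stands for its keynote; the study does not describe how the child's keynote was identified, and the pitch values are hypothetical.

```python
def error_from_standard(target, sung):
    """Error in half-tones, each sung tone compared with the standard pitch."""
    return sum(abs(t - s) for t, s in zip(target, sung))


def error_from_own_keynote(target, sung):
    """Error in half-tones after each melody is expressed relative to its
    own first tone (an assumed stand-in for the keynote)."""
    rel_target = [t - target[0] for t in target]
    rel_sung = [s - sung[0] for s in sung]
    return sum(abs(t - s) for t, s in zip(rel_target, rel_sung))


target = [0, 2, 4, 5, 7]
# Perfect pattern, but started three half-tones above the standard.
false_start = [p + 3 for p in target]

print(error_from_standard(target, false_start))     # 15
print(error_from_own_keynote(target, false_start))  # 0
```

A false start with a perfect pattern thus carries a large error from the standard and none from the child's own keynote, so that the second base isolates the acquisition of the pattern as such.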

The  mean  error  scores  and  standard  deviations  for  the  two 
applications  of  the  Williams  test  (3)  for  reproduction  are  given  in 
the  following  tabulation.  These  results  are  from  the  twenty-seven 
children  who  were  given  both  tests. 


                       Children    First Test    Final Test
Mean Error Score          31         107.4         113.4
Standard Deviation        21          63.1          69.9

The practice in learning the melody resulted in no group gain on the test for reproduction. The slight loss may be attributed to boredom and may parallel the loss in learning achievement from the thirty-first to the sixtieth trial. There is no evidence here of transfer from one ability to the other. These results are in agreement with the findings of Jersild and Bienstock (2).

[Figure 2. Learning curves for delayed recall; vertical axis, mean error score. The graphic is not recoverable from the scan.]

CORRELATIONS BETWEEN IMMEDIATE AND DELAYED RECALL

The  reliability  for  an  immediate  repetition  of  the  Williams  test 
for  tonal  reproduction  has  been  reported  (3,  p.  78)  as  being  of  the 
order  of  .90  or  better  for  groups  comparable  to  those  used  in  the 
present  study.  The  correlation  for  the  "error  score"  between  the 
initial  and  final  application  of  the  test  after  the  training  period 
reported  here  was  .698,  probable  error  .066,  for  twenty-seven  chil- 
dren. The  correlation  for  the  "correct  response"  score  was  .774, 
probable  error  .052,  for  the  same  children. 
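The probable errors quoted with these correlations appear consistent with the conventional formula of the period, P.E. = .6745 (1 - r²) / √N. The text does not state the formula, so its use here is an assumption; the function name is illustrative only.

```python
import math


def probable_error(r, n):
    """Conventional probable error of a correlation coefficient r
    computed from n cases (assumed formula, not stated in the text)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# Correct-response score, initial versus final test, twenty-seven children.
print(round(probable_error(0.774, 27), 3))  # 0.052
```

Applied to the other coefficients in this section, the same expression yields values that agree with those reported to within a unit in the third decimal place.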

There  seems  to  be,  therefore,  a  decrement  in  reliability  or  pre- 
dictive power  of  the  test  for  immediate  reproduction  when  this  is 
measured  over  a  six  week  period  such  as  that  covered  in  the  present 
study.  An  inspection  of  individual  scores  showed  that  gains,  losses, 
and  no  change  in  score  on  the  test  were  about  equally  distributed. 
The  split-half  reliability  of  the  test  of  reproduction  in  the  delayed 
recall  series  (first  application  of  the  test)  was  .869,  probable  error 
.029,  for  thirty-three  cases.  By  the  Spearman-Brown  formula  this 
reliability  became  .929. 
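The stepping-up of the split-half coefficient may be checked with a brief sketch of the Spearman-Brown formula, r' = k r / (1 + (k - 1) r), with k = 2 for a test of double length. The function name is illustrative.

```python
def spearman_brown(r_half, k=2.0):
    """Reliability predicted for a test k times as long as the one
    that yielded the coefficient r_half."""
    return k * r_half / (1 + (k - 1) * r_half)


print(round(spearman_brown(0.869), 4))  # 0.9299
```

The result agrees with the reported .929 apart from the rounding of the final digit.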

The  correlations  between  the  error  scores  for  the  Williams  tests 
of  immediate  recall  and  the  tests  of  delayed  recall  are  given  in 
Table  2. 

TABLE 2
Correlations Between Tests of Immediate and Delayed Recall for Thirty-One Children

                            First                Second
Test                     r       P.E.         r       P.E.

Standard Stimulus
  Preliminary          .536     .086        .595     .078
  After 10 trials      .582     .080        .566     .082
  After 31 trials      .630     .073        .806     .042

Child's Own Keynote
  Preliminary          .187     .117        .301     .110
  After 10 trials      .274     .112        .388     .102
  After 31 trials      .281     .111        .560     .083

On the basis of these results, it may be said that the reliability of the test of immediate reproduction of pitch is only slightly greater, after a six week interval, than the correlation of this test
with  scores  on  delayed  recall  after  practice,  when  both  are  measured 
from  a  given  standard  tone.  These  correlations  drop  materially 
when  the  scores  after  practice  are  measured  from  the  child's  own 
keynote.  This  is  interpreted  as  evidence  of  a  considerable  degree 
of  independence  between  two  abilities,  one  of  which  may  be  de- 
scribed as  the  immediate  reproduction  of  pitch  sequences  and  the 
other  as  the  ability  to  carry  in  memory  a  tonal  pattern.  The  differ- 
ences already  shown  between  the  learning  curves  calculated  from 
the  standard  stimulus  and  from  the  child's  own  keynote  are  con- 
sistent with  this  finding. 

The correlations between scores on the Williams tests of immediate reproduction and successive gains in the learning scores were also calculated for the thirty-one children who completed thirty-one practice trials. The results are given in the following tabulation:

                         Gain at               Gain of Thirty-First
Williams Test          Tenth Trial              Over Tenth Trial
                       r       P.E.              r       P.E.

Standard Stimulus
  I                  .097     .120             .131     .119
  II                 .294     .111             .385     .110

Child's Own Keynote
  I                 -.051     .121             .055     .121
  II                -.164     .118             .223     .115

These are, in a sense, the crucial correlations and add further evidence of the relative independence of the two abilities.

CONCLUSIONS 

The  following  conclusions  are  drawn  from  the  data  presented. 

1.  Under  controlled  conditions,  great  individual  differences  ex- 
isted in  the  ability  of  this  group  of  young  children  in  immediate 
reproduction  of  tonal  sequences  and  in  ability  to  learn  to  repro- 
duce a  simple  melody  in  delayed  recall.  Both  abilities  ranged  from 
an  achievement  which  was  little  more  than  what  might  be  achieved 
by  random  effort  to  a  practically  perfect  rendition. 

2. Group improvement in delayed recall was apparent over the practice period. There was some indication of more rapid early improvement in the learning of pattern than in the absolute reproduction of pitch. There was a slight loss as the result of prolonged practice. There was no evidence in the group results of transfer to immediate reproduction of pitch.

3.  A  test  of  immediate  recall  predicts  the  relative  position  of 
children  after  varying  degrees  of  training  with  correlations  of  the 
order  of  .60  when  these  results  are  calculated  from  the  given 
standard  tone  as  a  base.  These  correlations  drop  to  the  order  of  .20 
when  the  child's  own  keynote  is  used  as  a  base.  The  correlations 
between  ability  in  immediate  reproduction  and  gain  in  delayed 
recall  were  negligible. 

4.  These  results  suggest  that  there  is  a  considerable  degree  of  in- 
dependence between  ability  in  immediate  and  ability  in  delayed 
reproduction  of  pitch  sequences  at  the  preschool  level. 

REFERENCES 

1. Hattwick, Melvin S.: A preliminary study of pitch inflection in the speech of preschool children. [In] Williams, Harold M., Sievers, Clement H., and Hattwick, Melvin S.: The Measurement of Musical Development. Univ. Iowa Stud., Stud. in Child Welfare, 1932, 7, No. 1, Pp. 191. (p. 173-191)

2. Jersild, Arthur T., and Bienstock, Sylvia F.: The influence of training on the vocal ability of three-year-old children. Child Develop., 1931, 2, 272-291.

3. Williams, Harold M.: Studies in the measurement of musical development. [In] Williams, Harold M., Sievers, Clement H., and Hattwick, Melvin S.: The Measurement of Musical Development. Univ. Iowa Stud., Stud. in Child Welfare, 1932, 7, No. 1, Pp. 191. (p. 9-107)


APPENDIX 



EXAGGERATION  OF  INTERVALS 

Twenty-two preschool children, ages four years, two months to
five  years,  eleven  months,  who  had  scored  at  least  twenty-three  out 
of  twenty-five  correct  responses  on  the  intervals  of  the  first  main 
interval  singing  test  and  who  could  sing  directionally  what  they 
heard  in  the  Ediphone  test  were  later  given  individual  tests  of  pitch 
discrimination. The twenty-five trials used bells as stimuli; the following intervals were used: 21, 18, 14, 10, 6, 4, 3 d. v.

Only  nine  of  the  twenty-two  children  exaggerated  either  the 
larger  or  smaller  intervals  to  a  perceptible  extent.  As  far  as  the 
experimenter  could  tell,  thirteen  sang  accurately  even  on  the  smaller 
intervals.  The  examiner  could  not  determine  reliably  whether  the 
intervals sung below 3 d. v. were sung on pitch or not since the
exaggerated  differences  were  so  small. 

PROCEDURE  AND  INSTRUCTIONS  FOR  THE  SECOND 
INTERVAL  SINGING  TEST 

Each  child  was  brought  into  the  room  and  seated  at  a  table  on 
the  opposite  side  from  where  the  experimenter  sat.  An  informal 
period  of  a  few  minutes  was  devoted  to  establishing  rapport. 

After  eight  trials  of  illustrated  instructions  the  child  was  given 
a  ten  trial  test.  If  he  responded  correctly  on  nine  of  the  ten  trials, 
he  was  considered  testable  for  pitch  discrimination.  If  he  scored 
less  than  nine  successes  on  ten  trials,  the  experimenter  gave  him 
ten  practice  trials  on  the  30  d.  v.  interval,  recording  the  verbal 
responses.  If  the  child  made  a  mistake  in  singing,  the  experimenter 
said, "No, the bells didn't ring 'ding-ding.' Listen." The experimenter rang the bells; then sang like the bells. "That's the way the bells rang. Now we'll try some more, and be sure to sing just like the bells." At the end of the ten trial practice period a test
series  of  ten  trials  was  given.  If  the  child  still  scored  less  than  nine 
out  of  ten  successes,  he  was  given  another  practice  series  of  ten 
trials  and  a  final  series  of  ten  test  trials. 
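The decision rule in this procedure may be sketched as follows. The criterion of nine successes in ten trials and the limit of three test series (the initial test and two retests after practice) are taken from the description above; the label returned when the criterion is never met is an assumption, since the text does not say how such children were classified.

```python
def classify_testability(test_scores, criterion=9, max_tests=3):
    """test_scores: successes out of ten on each successive test series.
    Returns a label and the number of test series considered."""
    considered = test_scores[:max_tests]
    for attempt, score in enumerate(considered, start=1):
        if score >= criterion:
            return ("testable", attempt)
    # Hypothetical label: the source gives no name for this outcome.
    return ("criterion not met", len(considered))


print(classify_testability([9]))         # ('testable', 1)
print(classify_testability([7, 8, 10]))  # ('testable', 3)
print(classify_testability([5, 6, 7]))   # ('criterion not met', 3)
```

Each retest in the list is understood to follow a practice series of ten trials on the 30 d. v. interval, as described above.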



TYPICAL PROCEDURE AND INSTRUCTIONS USED IN GROUP TESTING

The  children  were  seated  alternately  and  provided  with  pencils 
and  appropriate  blanks.  Instructions  were  as  follows : 

"Going  Up  —  Going  Down"  Terminology 

"I want you to listen very carefully because this is going to be a hearing game." The word test was used in place of game for the fifth grade. "You will hear two sounds, one right after the other, coming from the Victrola. Sometimes the last sound will be going up, sometimes it will be going down. Listen." The examiner played four intervals, stating the correct response after each. "Now before we start the real game we'll have some practice trials to show you how it goes. I'll play some sounds and you will tell me which way they are going, up or down, and I'll write the answers on the large form here on the board. Now listen carefully."

The  examiner  presented  at  least  five  intervals  saying  after  the 
group  response,  "Yes,  going  up  (or  down)"  and  writing  "U"  or 
"D"  in  the  correct  square  of  the  reproduced  form  on  the  black- 
board. At  the  end  of  the  practice  period  he  said  to  the  children, 
"Does  every  one  understand  how  to  play  the  game?  Don't  be 
afraid  to  raise  your  hand  if  you  have  questions."  (Pause)  Usually 
there  were  several  who  had  questions,  and  these  were  answered 
before  proceeding  with  the  actual  test.  The  questions  were  recorded. 

The  examiner  then  said,  "Now  take  your  pencils  and  get  ready 
to  start  on  row  number  1.  Go  right  across  the  page  just  as  you  do 
in  reading.  If  the  sounds  are  going  up,  write  a  U  (illustrating)  in 
the square; if the sounds are going down, write a D (illustrating) in the square. Remember the rest of your grade was just in here (or will be in here next) and I want to see if you can do as well
as  they  did  (or  will  do).  Ready?" 

After  each  ten  trials  the  test  was  stopped,  and  the  examiner 
said,  "Remember  we  have  to  be  very  quiet  so  everyone  can  hear 
well.  Now  start  on  row  number  2.  If  the  sounds  are  going  up,  mark 
U  in  the  square,  if  they  are  going  down,  mark  D.  Ready?  Very 
quiet  now." 

Descriptions of Forms A, B, and C are given on page 32.
The  procedure  for  recording  Forms  B  and  C  was  similar  to  the 
above  except  for  the  necessary  changes  in  verbal  instructions  due  to 
the  method  of  recording. 



"Higher-Lower" Terminology

Instructions paralleled those for the "going up" and "going down" terminology, except for the substitution of "higher" and "lower" wherever "going up" and "going down" occurred.

A  RECORDED  TEST  OF  PITCH  DISCRIMINATION  FOR 

YOUNG  CHILDREN 

The  advantages  of  a  recorded  test  of  pitch  discrimination  for 
young  children  are  (1)  simplicity  of  test  administration,  (2) 
standardization  of  sound  production,  and  (3)  eventual  monetary 
economy. The disadvantages are (1) introduction of extraneous
record  noises  (in  comparison  to  the  sound  produced  by  a  tuning 
fork  or  bar)  and  (2)  difficulties  in  varying  the  pace  of  the  test 
to  meet  the  demands  of  children's  groups. 

Present  methods  of  recording  reduce  the  first  disadvantage  to  a 
minimum,  but  the  last  is  more  difficult  to  overcome.  It  is  possible, 
however,  to  make  a  recorded  test  which  will  meet  the  requirements 
of  the  present  bell  test  to  a  large  degree.  In  addition,  if  proper 
arrangements  were  made,  the  children's  test  could  be  recorded  on 
the  same  disc  as  that  for  adults,  as  suggested  in  the  following  para- 
graphs. 

The  present  Seashore  test  of  pitch  discrimination  presents  the  in- 
tervals in  too  rapid  succession  for  certain  percentages  of  the  fifth, 
fourth,  third,  and  second  grade  children.  A  comparison  of  the 
average  time  taken  by  the  groups  when  setting  their  own  pace  in 
the  test  with  that  of  the  present  recorded  test  indicates  that  an 
additional  three  or  four  seconds  between  each  test  trial  would  pace 
the  test  so  that  every  child  would  have  time  to  respond  to  each 
individual  trial.  Such  an  extension  between  trials  should  not  result 
in  a  less  reliable  test  for  adults. 

Since  the  Seashore  pitch  test  runs  continuously  except  for  chang- 
ing from  one  side  of  the  record  to  the  other,  it  is  practically  impos- 
sible to  give  selected  series  from  the  record  (e.g.,  to  present  the  23, 
12,  8,  and  2  d.  v.  ten  trial  series  without  the  other  series).  If  a 
narrow  nonrecorded  section  were  left  between  each  ten  trial  series, 
the  examiner  could  then  choose  the  intervals  for  the  children's  test 
(five  ten-trial  series)  from  the  adult  test  (ten  ten-trial  series). 
After  each  ten  trial  series  in  the  children's  test  is  completed,  the 
needle  could  be  lifted  and  placed,  when  ready,  on  the  following 
appropriate series. Certain cautions or commendations could be made during this intermission and instructions repeated before
going  to  the  following  series.  The  value  of  breaking  up  the  fifty 
trial  test  into  units  of  ten  trials  each  has  been  discussed  and  recom- 
mended in  the  first  study  in  this  monograph. 

The  revised  test  of  pitch  discrimination  for  adults  and  older 
children to be recorded soon does not include the 30 or 17 d. v.
intervals  of  the  test  for  children  suggested  here.  A  15  d.  v.  interval 
might  be  substituted  for  the  17  d.  v.  interval  used  in  the  present 
test,  but  it  is  strongly  recommended  that  the  large  30  d.  v.  interval 
be  included  in  the  test.  This  interval  has  little  discriminative  value, 
but  for  demonstrative  and  instructional  purposes  with  children  a 
large  easily  heard  interval  is  strongly  recommended. 

By allowing a few seconds more time between individual trials
and  by  placing  a  narrow  nonrecorded  section  between  each  ten 
trial  series  on  the  record,  the  forthcoming  revision  for  adults  and 
older  children  may  well  be  used  as  a  test  for  younger  children.  In 
addition  to  permitting  a  short  group  test  for  children  (fifty  trials, 
using  the  30,  15,  8,  4,  and  2  d.  v.),  such  a  recording  would  permit 
the  further  advantage  of  concentrating  on  the  important  intervals 
in  individual  tests  to  the  exclusion  of  other  intervals. 




